6 Replies Latest reply on Jan 27, 2014 9:43 AM by @Six_Steven

    Xml indents vs tabs in InDesign: whitespaces & linebreaks

    @Six_Steven

      Hi hello!

       

      My name is Steven and I have an issue concerning working with XML for use in InDesign.

      I've made a valid .xml document structured with tags, so I can use the tags for rendering the layout (using InDesign styles).

      The mapping works well, but the hangups are whitespaces and line breaks.

       

      Now, what's a bit confusing is that when I import the XML file into my InDesign document, it also renders the tabs used in my XML file.

      Basically InDesign seems to pick up whatever whitespace and line breaks are found in the XML file.

      But these are meaningless to XML and HTML, so why does InDesign treat an XML file as if it's a formatted document?

       

      I found some whitespace-handling instructions such as <?whitespace-handling use-characters?> but they don't work for me...

      Perhaps it's something I'm doing wrong? I just want InDesign to keep the line breaks I'm using in my XML file, but not the tabs, as those are only there to keep my structured XML file legible.

       

      If anyone wants to take a look at my project, it is available at the following link:

      https://www.dropbox.com/s/oaqk31yr1aykf5y/xml-sip.zip

       

      I appreciate your comments!

       

      Other related/similar issues: http://www.justskins.com/forums/xml-import-whitespace-and-78216.html

        • 1. Re: Xml indents vs tabs in InDesign: whitespaces & linebreaks
          MW Design Level 4

          Basically ID has "dumb" XML import handling. At least if you think you can just drag an imported XML file onto the page and call it good.

           

          You have a couple of options. One is to make a run-in template. Then you can simply allow ID to strip the whitespace (which gives the single-paragraph-looking result you get when the XML is simply plopped onto the page). I sometimes have issues with the stripping in any case.

           

          Or you can process the XML so that line breaks fall where you want them and any desired whitespace is in place (like tabs within an element and/or separating elements within the same paragraph). Which method I use depends more on what I expect to do following the import. But I usually use this second method, in part because of what I use ID for with XML.

           

          Often my instructions to whoever is producing the XML about how I want it formatted are not followed, so the file doesn't reach me in the form I asked for. So after the XML is generated from a database I post-process it in a text editor (using saved Find/Replaces and macros), as well as in an XML editor. I also use XSL files when that is the best method for processing the XML before it hits my text editor.

           

          Not knowing your layout, a link to the modified XML file follows. It is modified for how I would envision using your XML in ID.

          https://www.dropbox.com/s/fgzrvov7wtvpodx/1SPV_nl-mod.xml

           

          Mike

          • 2. Re: Xml indents vs tabs in InDesign: whitespaces & linebreaks
            MW Design Level 4

            BTW, Steven, I meant to comment on another thing.

             

            I have no idea of your layout. So take this for what it's worth. You have the tag <feature> both within the <product-features> tag and within the <outside> tag. So when mapping to styles, these would naturally be formatted identically. If that's not how you want the ID paragraph mapping to be, then you are going to need these to be different.

             

            Mike

            • 3. Re: Xml indents vs tabs in InDesign: whitespaces & linebreaks
              @Six_Steven Level 1

              Hi hello Mike,

              Thank you for your prompt answer. This info is really helpful for me. When I link your XML file to my ID document, the styles match perfectly, and I can apply "whitespace above" to, for example, my titles, so my ID layout looks the way I wanted. So just perfect!

              But... when you use your text editor for the post-XML processing, can it be done automatically (using, for example, a specific .XSL file), or do you make the changes manually? Do you have an example of an .XSL file or a script that only strips the tabs (indents) from the original XML file, so that it ends up like the modified XML you provided?

              That way I wouldn't have to do it manually, which is very time consuming. Furthermore, the original XML will be generated from a database in the future, so I can't really influence the way the XML is generated.

              I really appreciate your help!

              Steven

              • 4. Re: Xml indents vs tabs in InDesign: whitespaces & linebreaks
                MW Design Level 4

                Hello Steven,

                 

                I need to run out and will look back into the thread later tonight. In general, using anything but clean XML in ID is a crap-shoot. So I avoid it, especially using XSL or XSLT transformations during ID's import. All I want to hit ID is "clean" XML. It's predictable.

                 

                My general madness of working is not to alter/use XSLTs to accomplish something that a text editor is quicker at. So for the simple removal of leading tabs and/or spaces before the first element on a line, I use UltraEdit (my main text editor). This is especially helpful when I get, as I often do, several XML files for the same job (like separate sections of the same catalog that will be formatted differently). With a decent text editor, I can run the find/replaces on an entire folder. By the time I load the files into my XML editor, most of the work has been done by the various F/Rs I run.

                 

                You are using a Mac, and I don't know what text editor(s) you have. In UltraEdit (both Windows and Mac versions are available), to strip leading tabs and/or spaces, the F/R string is

                 

                %^{^t+^}^{ +^}

                 

                Which equates to Beginning of New Line, one or more Tabs, OR, one or more Spaces.

                 

                In Notepad++, using its regular expression mode (I think NP++ is only a Windows text editor), it is simply:

                 

                ^\s*

                 

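                So again: Beginning of New Line, followed by any run of whitespace (strictly, the * means zero or more). In NP++ regular expressions, \s matches any whitespace character, so tabs and spaces are both covered. Because \s can also match line breaks, ^[ \t]+ is the more conservative pattern if blank lines must survive.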

                 

                Point of the above is that in a decent text editor for the Mac, there will be expressions you can use. I just don't know what they would be. So look into their help files. And if you post what text editor you have, someone here with the same text editor should be able to aid that aspect.
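For readers without a suitable editor, the same find/replace can be sketched as a small script. This is a minimal Python sketch, not something used in the thread: the function names `strip_leading_whitespace` and `strip_folder` are made up for illustration, and it assumes the XML files are UTF-8. Like the editor patterns above, it is purely line-based, so it also removes indentation inside multi-line text content.

```python
import re
from pathlib import Path

# A run of tabs/spaces at the start of any line.
# [ \t]+ is used instead of \s so that line breaks (and therefore
# blank lines) are never consumed.
LEADING_WS = re.compile(r"^[ \t]+", re.MULTILINE)


def strip_leading_whitespace(text: str) -> str:
    """Remove indentation from every line; line breaks stay untouched."""
    return LEADING_WS.sub("", text)


def strip_folder(folder: str, pattern: str = "*.xml") -> int:
    """Rewrite every matching file in `folder` in place.

    Returns the number of files processed. Files are assumed to be UTF-8.
    """
    count = 0
    for path in sorted(Path(folder).glob(pattern)):
        original = path.read_text(encoding="utf-8")
        path.write_text(strip_leading_whitespace(original), encoding="utf-8")
        count += 1
    return count
```

Calling `strip_folder("path/to/xml")` would process a whole folder in one go, which mirrors running the F/R on an entire folder; working on copies of the files is advisable, since the rewrite happens in place.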

                 

                I do the same with pulling up elements into the same lines. I just need to look at the XML to find how they should be (which elements need to be included together, which need to start a new line, etc.) and do a quick F/R based upon those patterns. This takes longer to write about than to do, at least once you have a little practice. I recently formatted an XML file with about the same degree of moving things around as your file (and removing the leading tabs/spaces). Once laid out in ID it was 187 pages. So the F/R functions are fast even on large XML files.
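The "pulling up" step can be sketched the same way. The Python below is only an illustration under assumptions: the helper name `join_elements` and the tag names in the usage note are hypothetical, since the thread does not say which elements belong together. It replaces whatever whitespace sits between a closing tag and the next opening tag with a single separator (a tab by default).

```python
import re


def join_elements(text: str, first: str, second: str, separator: str = "\t") -> str:
    """Pull <second> up onto the same line as the preceding </first>.

    Any whitespace (including line breaks) between the two tags is
    replaced by `separator`.
    """
    pattern = re.compile(
        r"(</" + re.escape(first) + r">)\s*(<" + re.escape(second) + r"[\s/>])"
    )
    # A function is used as the replacement so the separator is inserted
    # literally, without backslash processing.
    return pattern.sub(lambda m: m.group(1) + separator + m.group(2), text)
```

For example, `join_elements(xml_text, "name", "price")` would put each hypothetical `<price>` element on the line of the `<name>` before it, separated by a tab that a paragraph style in ID could then align.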

                 

                Take care, Mike

                • 5. Re: Xml indents vs tabs in InDesign: whitespaces & linebreaks
                  [Jongware] Most Valuable Participant

                  Six_Steven,

                   

                  The past three and some hours I've been struggling with the exact same issue. I want *some* of the white space imported (at the end of tags that translate to paragraphs) but not *all* (i.e., in my source about every major XML element ends with a hard return).

                   

                  This post

                   

                  http://forums.adobe.com/thread/510602

                   

                  gave me the crucial hint. Here is what I did: uncheck "Import white-space only elements". That way, you get everything on the same line. That's what I wanted, because I also have a simple XSLT file that basically says

                   

                  <xsl:template match="title"><title><xsl:apply-templates /><xsl:text>
                  </xsl:text></title></xsl:template>
                  

                   

                  for each of the elements where I do want a hard return at the end. Note the <xsl:text>..</xsl:text>, which contains only a single Return. This is "the" trick; all redundant white space is removed, and this one command inserts it where needed.

                   

                  (Edit) oh wait ... I needed "only one more thing" with this. And, naturally, this trick works three times but I cannot get it to work a fourth time.

                   

                  (Edit #2) Solved, after another 2 hours or so of teeth-grinding and hair-pulling.

                   

                  The trick seems to be:

                   

                  1. Do not use InDesign's own 'ignore whitespace'. It's worthless, in the sense it's an all-or-nothing option -- and you don't want either.

                   

                  2. Using XSLT, remove all whitespace inside all elements using

                   

                  <xsl:template match="text()"><xsl:value-of select="normalize-space()"/></xsl:template>

                   

                  3. .. and selectively maintain whitespace where you want it using

                   

                  <xsl:template match="saveInside//text()"><xsl:value-of select="."/></xsl:template>

                   

                  where saveInside is the name of the main XML tag in which you do want the original whitespace to remain.
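As a rough illustration of what steps 2 and 3 do to a document, here is a Python sketch using only the standard library. It is not the XSLT from this post and not an InDesign feature; the function names are made up, and `saveInside` is the same placeholder tag name as above. It mimics `normalize-space()` on every text node except those inside the preserved element.

```python
import re
import xml.etree.ElementTree as ET

_XML_WS = " \t\r\n"  # the four XML whitespace characters


def normalize_space(value):
    """Mimic XPath normalize-space(): trim, then collapse whitespace runs."""
    if value is None:
        return None
    return re.sub(r"[ \t\r\n]+", " ", value).strip(_XML_WS)


def clean(elem, keep_tag, preserving=False):
    """Normalize text nodes in place, except those inside <keep_tag>."""
    inside = preserving or elem.tag == keep_tag
    if not inside:
        elem.text = normalize_space(elem.text)
    for child in elem:
        clean(child, keep_tag, inside)
        if not inside:
            # A child's tail is a text node belonging to *this* element.
            child.tail = normalize_space(child.tail)


def clean_xml(source: str, keep_tag: str = "saveInside") -> str:
    """Parse, clean and re-serialize an XML string.

    Note: ElementTree drops comments and processing instructions
    when parsing with the default parser.
    """
    root = ET.fromstring(source)
    clean(root, keep_tag)
    return ET.tostring(root, encoding="unicode")
```

As with the XSLT, spaces directly next to inline child elements are trimmed as well, so mixed content may need its own exception.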

                  • 6. Re: Xml indents vs tabs in InDesign: whitespaces & linebreaks
                    @Six_Steven Level 1

                    Hi hello,

                    Do you perhaps have a test case which I can download, so I can take a look at your ID, XML and XSL files? This would definitely help!

                    Thanks in advance! Best Regards,