5 Replies Latest reply on Dec 29, 2012 10:18 PM by John Hawkinson

    How can I combine XML info into HTML/ePub export?

    SLTyPete Level 1

      I'm creating a workflow that exports large InDesign documents to a complex online eDocument system.

       

      The documents contain highly formatted chunks of content with no automatic semantic (meaning/purpose) markup.

       

      It looks like I can add structure tags through a variety of means, to provide the semantic meaning (styles->tags, scripts, grep styles, plus some manual effort).

       

      My problem is: how do I inject the structure/tag info into the output?

       

      I've tried exporting according to XML (tag) structure. That appears to successfully define the sequence of output, but none of the actual tag structure shows up in the output (in either EPUB or HTML output).

       

      Ideally, I could export to HTML or ePub and surround content chunks with <div> tags marked up with the tag names.

       

      But right now, I'd be happy with any export format that gives me both the HTML content and the XML/tagged semantics.

       

      Anybody have any ideas?

       

      Thanks so much,

      Pete

        • 1. Re: How can I combine XML info into HTML/ePub export?
          John Hawkinson Level 5

          The intended use of this feature (XML tagging) is for InDesign's XML export, not EPUB or HTML export.

          So...you are not in a good situation.

           

          If you have a sufficiently advanced workflow, you may be able to translate the XML export (or potentially InDesign's IDML export, though that is much more complicated) into what you want from EPUB or HTML.
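
To make the "translate the XML export" idea concrete, here is a minimal sketch in Python. It assumes the exported XML is well formed, that structure tags carry the semantic names (TopSection and SectionLev2 are hypothetical tag names), and that elements already named h1, h2, h3, or p should pass through unchanged. Every other element becomes a <div> whose class is the original tag name. This is not InDesign's own behavior, just one way a downstream step might work.

```python
import xml.etree.ElementTree as ET

# Tags that are already valid HTML and should be kept as-is (assumption).
HTML_TAGS = ("h1", "h2", "h3", "p")

def tags_to_divs(elem):
    """Return a copy of elem where each non-HTML tag becomes <div class="TagName">."""
    if elem.tag in HTML_TAGS:
        out = ET.Element(elem.tag, dict(elem.attrib))
    else:
        out = ET.Element("div", {"class": elem.tag})
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(tags_to_divs(child))
    return out

# Hypothetical fragment standing in for an InDesign XML export.
sample = (
    "<TopSection><h1>Top header</h1>"
    "<SectionLev2><h2>Section header</h2><p>some content</p></SectionLev2>"
    "</TopSection>"
)

html = ET.tostring(tags_to_divs(ET.fromstring(sample)), encoding="unicode")
print(html)
```

The formatting that HTML/EPUB export would have produced is not recovered by this step; it only carries the tag hierarchy across.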

          • 2. Re: How can I combine XML info into HTML/ePub export?
            SLTyPete Level 1

            Hmmm...

             

            XML is *not* a requirement. Just finding a way to represent the meaning of the content.

             

            The real challenge (in this case) reduces to grouping content into a hierarchy, much like the levels of a TOC.

             

            I can easily define a particular HTML tag for a given paragraph style (eg H1).

             

            What I can't do so easily is put a <DIV> around the entire section represented by that header. In a sense, I'd love a way to put nested <DIV>s around each H1 level "chunk" of content, and each H2 level etc.

             

            IOW (using semi-real HTML)... instead of just

             

            <H1>Top header</H1>
              <H2>Section header</H2>
                   some content
              <H2>Section header</H2>
                   some content
            <H1>Top header</H1>
               ...

             

            I want

            <DIV class="TopSection">
              <H1>Top header</H1>
              <DIV class="SectionLev2">
                <H2>Section header</H2>
                   some content
              </DIV>
              <DIV class="SectionLev2">
                <H2>Section header</H2>
                   some content
              </DIV>
            </DIV>
            <DIV class="TopSection">
              <H1>Top header</H1>
               ...
            </DIV>

             

            It seems this should not be so hard... but apparently it is?!

            • 3. Re: How can I combine XML info into HTML/ePub export?
              John Hawkinson Level 5

              Well, I'm not the right person to talk to for EPUB and HTML exports, so don't pay too much attention to me, but …


              InDesign's EPUB and HTML export functions just do not give you a lot of direct control over the code that implements the markup. If you are expecting that, you are going to bang your head against the wall and be sad, over and over and over again.

               

              What version of ID are you using? There have been a lot of changes (though I think none of them particularly help you).

               

              If I were you, I think I would probably postprocess the output file. Convert all <h1>foo</h1> to <div class="Topsection"><h1>foo</h1></div> after InDesign outputs whatever file format you want. This has the advantage of not complicating your InDesign user experience (and thus reducing errors), and actually working.

               

              If you want to use XML export, though, you can achieve what you want. Make a div tag, apply it, then in the Structure pane Control-click on it, choose New Attribute, and set class to Topsection. Then make an h1 tag within it. This will let you apply all the markup in InDesign, but it sounds incredibly painful.

               

              Though maybe your content is coming from somewhere else?

              • 4. Re: How can I combine XML info into HTML/ePub export?
                SLTyPete Level 1

                Thanks for hanging in there with me, John.

                 

                Your postprocess example unfortunately doesn't work: it simply puts a div around the H1 itself rather than the H1 plus everything up to the next H1.

                 

                I'm beginning to think I'll have to use XML export, simply because HTML/ePub export does not support hierarchy at all. That's sad.

                 

                100% of the content is already in InDesign (we're talking tens of thousands of pages! A rather huge content library.)

                 

                As you can guess, this is NOT going to happen manually, not if I can help it. Not enough monkeys to get the job done.

                 

                But... I can script it. I was hoping to avoid XML for a lot of reasons, not least because HTML/ePub provides much better formatting. But there are workarounds... just a lot of Work to get Around the hassles...

                • 5. Re: How can I combine XML info into HTML/ePub export?
                  John Hawkinson Level 5

                  "Your postprocess example unfortunately doesn't work: it simply puts a div around the H1 itself rather than the H1 plus everything up to the next H1."

                  Sorry, yes, I got that wrong.

                  But it doesn't invalidate the idea of postprocessing. It just makes it (a little?) more complicated.
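
As a rough illustration of the more complicated version, here is a Python sketch that wraps each heading plus everything up to the next heading of the same or higher level. The assumptions: the exported body is well-formed XHTML with the headings and content as flat siblings, the tags carry no XML namespace (real EPUB XHTML usually does, so tag names would need adjusting), and the class names TopSection and SectionLev2 are the ones from the example earlier in this thread. It keeps a stack of open sections and closes them whenever a heading of equal or higher rank arrives.

```python
import xml.etree.ElementTree as ET

# Heading tag -> (div class, nesting level). Class names are assumptions.
LEVELS = {"h1": ("TopSection", 1), "h2": ("SectionLev2", 2)}

def nest_sections(body):
    """Group the flat children of body into nested <div>s, one per heading."""
    new_body = ET.Element(body.tag, dict(body.attrib))
    stack = [(0, new_body)]  # (heading level, container element)
    for child in list(body):
        tag = child.tag.lower()
        if tag in LEVELS:
            cls, level = LEVELS[tag]
            # Close any open sections at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            div = ET.SubElement(stack[-1][1], "div", {"class": cls})
            stack.append((level, div))
        # The heading itself and all following content go into the open section.
        stack[-1][1].append(child)
    return new_body

# Hypothetical flat export, mirroring the example in this thread.
flat = (
    "<body><h1>Top header</h1>"
    "<h2>Section header</h2><p>some content</p>"
    "<h2>Section header</h2><p>some content</p>"
    "<h1>Top header</h1><p>more</p></body>"
)

result = ET.tostring(nest_sections(ET.fromstring(flat)), encoding="unicode")
print(result)
```

Adding an h3 level would only need another entry in LEVELS; content that appears before the first heading stays directly under the body.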

                   

                  Good luck.