5 Replies Latest reply on Mar 4, 2010 12:42 PM by ThisGuy-500

    extracting italics in XML

    ThisGuy-500 Level 1

      Hi All,

       

      My goal is to have italics tagged in my final XML export.  I'm wondering a couple of things:

       

      When an italic font (e.g., Frutiger Italic) is placed in InDesign, does InDesign mark this in the document  in any way as an Italic font, or is it just text?  That is, I'm looking for a flag or a hook that I can use to indicate in the final XML that this should be italicized text.

       

      What are my options?

       

      Any help would be appreciated...

       

      Thanks,

       

      ThisGuy

        • 1. Re: extracting italics in XML
          [Jongware] Most Valuable Participant
          .. does InDesign mark this in the document  in any way as an Italic font? ..

           

          Alas -- no.

          But Adobe has a perfectly good reason not to do this: you can have a Light Italic, Regular Italic, Semibold Italic, Bold Italic, and Black Italic -- all in the same font! Eat that, Microsoft Word! (More generally: the entire Windows font setup is rigged from the start to only "allow" Bold, Italic, and Bold Italic font variants.)

           

          You can still use search-and-replace to find an italic variant of your current font and replace it with an XML tag -- CS4 allows a tag in the Replace field (but not in the Find field!). But you have to specify the exact name of your current italic variant(s).

           

          For Times New Roman and Minion, italics are simply called "Italic"; but for Helvetica, it's called "Oblique", and that's what you have to search for. Searching for "Helvetica Italic" does not work. For ITC Garamond Light, it's called "Light Italic"; and for Myriad Pro, well, myriads of its variants are called "weight-such-and-such Italic".

           

          A document has a property "fonts", which lists all used fonts, so perhaps you could loop over this and determine by the names in there which ones to tag; for example, if the name contains "Italic", "Oblique", "Slanted". Be aware that Frutiger and Univers use names like "46 Light Italic", so it may not be a simple one-word string.
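
          A minimal sketch of that name check, assuming the three keywords above are enough for the fonts in your document. The helper names isItalicStyleName and collectItalicStyleNames are made up for illustration, and the InDesign part at the end assumes each font's style name is available as fontStyleName; it only runs inside InDesign.

```javascript
// Keywords that usually mark a slanted variant; extend the list as needed.
var ITALIC_KEYWORDS = ["italic", "oblique", "slanted"];

// Returns true when a style name such as "46 Light Italic" contains a keyword.
function isItalicStyleName(styleName) {
    var lower = String(styleName).toLowerCase();
    for (var i = 0; i < ITALIC_KEYWORDS.length; i++) {
        if (lower.indexOf(ITALIC_KEYWORDS[i]) !== -1) {
            return true;
        }
    }
    return false;
}

// Collects the distinct italic-looking names from a list of style names,
// keeping the order in which they first appear.
function collectItalicStyleNames(styleNames) {
    var seen = {};
    var result = [];
    for (var i = 0; i < styleNames.length; i++) {
        var name = styleNames[i];
        if (isItalicStyleName(name) && !seen.hasOwnProperty(name)) {
            seen[name] = true;
            result.push(name);
        }
    }
    return result;
}

// Inside InDesign only: gather the style names from the document's fonts.
// (Assumption: fontStyleName holds the variant name, e.g. "46 Light Italic".)
if (typeof app !== "undefined" && app.documents.length > 0) {
    var fonts = app.activeDocument.fonts.everyItem().getElements();
    var names = [];
    for (var f = 0; f < fonts.length; f++) {
        names.push(fonts[f].fontStyleName);
    }
    alert(collectItalicStyleNames(names).join("\n"));
}
```

          The substring match is deliberately loose so that multi-word names are caught; check the resulting list by eye before tagging anything with it.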

           

          (It also leads to another interesting point: if you encounter a "Bold Italic", how do you know if it's been used as a simple local attribute "Italic" in an otherwise Bold text -- such as a heading -- or if it's actually been used to mark Bold Italic text inside regular text? In the latter case, you would need an XML element "Bold Italic"; in the former, just plain "Italic". Perhaps you ought to check the default font of the paragraph style ...)
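
          One way to sketch that paragraph-style check: strip the slant word from both style names and compare what is left. The functions weightOf and tagForItalicRun and the tag names "i" and "bi" are hypothetical. In InDesign the two inputs could perhaps come from a text run's fontStyle and its appliedParagraphStyle.fontStyle, but that wiring is an assumption and is not shown here.

```javascript
// Removes the slant keyword and returns the remaining weight,
// e.g. "Bold Italic" -> "Bold". A bare "Italic" is treated as "Regular".
function weightOf(styleName) {
    var weight = String(styleName)
        .replace(/\b(italic|oblique|slanted)\b/gi, "")
        .replace(/\s+/g, " ")
        .replace(/^\s+|\s+$/g, "");
    return weight === "" ? "Regular" : weight;
}

// Hypothetical tag choice: "i" when only the slant differs from the
// paragraph's default style, "bi" when the weight differs as well.
function tagForItalicRun(runStyle, paragraphDefaultStyle) {
    var runWeight = weightOf(runStyle).toLowerCase();
    var baseWeight = weightOf(paragraphDefaultStyle).toLowerCase();
    return runWeight === baseWeight ? "i" : "bi";
}
```

          For a "Bold Italic" run inside a Bold heading this gives "i"; inside Regular body text it gives "bi". It treats every weight difference as bold and does not know that some families say "Roman" or "Book" instead of "Regular", so those cases need extra rules.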

          • 2. Re: extracting italics in XML
            ThisGuy-500 Level 1

            Jongware,

             

            Right...the benefit of font variations working well is the bane of trying to create a good extract.  I may have to reduce the number of fonts to one simple Italic style and conduct a find and replace operation. It just takes more time and effort.  I'll post back here if I come up with a brilliant work-around.

             

            Thanks as always for your thorough responses.

             

            -ThisGuy

            • 3. Re: extracting italics in XML
              [Jongware] Most Valuable Participant

              ..  to one simple Italic style ...

               

              Oh wait! I forgot all about character styles!

               

              Putting all your different italics into character styles might be a good idea. If you only and exclusively use character styles (which is a good habit!), you can check their names, and devise a system of your own -- for example, the style defining Univers 55 Oblique could be simply named "Italic (Univers)"; for ITC Stone Serif Semibold Italic, it would be "Italic (Stone Serif Semibold)" -- and so on.
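
              With a naming scheme like that, picking the tag from the style name becomes a small string job. A rough sketch, assuming names of the form "Kind (Font)"; the function names and the kind-to-tag table are invented for illustration and are not part of InDesign.

```javascript
// Trims leading and trailing whitespace (ExtendScript has no String.trim).
function trimText(text) {
    return String(text).replace(/^\s+|\s+$/g, "");
}

// Parses a name like "Italic (Stone Serif Semibold)" into kind and font.
// Returns null when the name does not follow the "Kind (Font)" pattern.
function parseCharacterStyleName(name) {
    var text = trimText(name);
    var open = text.indexOf("(");
    var close = text.lastIndexOf(")");
    if (open <= 0 || close !== text.length - 1 || close < open + 2) {
        return null;
    }
    var kind = trimText(text.substring(0, open));
    var font = trimText(text.substring(open + 1, close));
    if (kind === "" || font === "") {
        return null;
    }
    return { kind: kind, font: font };
}

// Hypothetical mapping from the "kind" part of a style name to an XML tag.
var TAG_FOR_KIND = { "Italic": "i", "Bold": "b", "Bold Italic": "bi" };

// Returns the tag for a character style name, or null if none applies.
function tagForCharacterStyle(name) {
    var parsed = parseCharacterStyleName(name);
    if (parsed && TAG_FOR_KIND.hasOwnProperty(parsed.kind)) {
        return TAG_FOR_KIND[parsed.kind];
    }
    return null;
}
```

              Here tagForCharacterStyle("Italic (Univers)") returns "i", and a name that does not follow the convention, such as "Body Text", returns null and is left untagged.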

              • 4. Re: extracting italics in XML
                ThisGuy-500 Level 1

                Jongware,

                 

                I wish.  No, the files that I'm receiving are not that clean...perhaps in the future I can persuade the team that's building the files.

                 

                I think I'll use the find and replace and change to local styling.  At the moment this is the shortest route with what I'm starting with.

                 

                Tom

                • 5. Re: extracting italics in XML
                  [Jongware] Most Valuable Participant

                  Easy enough to get it started. This real quick first go will find all "Italic" text (and -- surprise! Even the "Oblique" of Helvetica! But not "Futura Medium Italic", so it has to be some internal trick for Helvetica...).

                   

                  I called the markup tag which is inserted "i", short and clear, but remember to Insert Your Name There.

                   

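                  // Clear any leftover Find/Change settings.
                  app.findTextPreferences = null;
                  app.changeTextPreferences = null;

                  // Find text whose font style is "Italic" and mark it up with the XML tag "i".
                  app.findTextPreferences.fontStyle = "Italic";
                  app.changeTextPreferences.markupTag = "i";

                  app.activeDocument.changeText();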
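
                  Since a variant like "Futura Medium Italic" is not caught by searching for plain "Italic", one possible extension is to repeat the same find/change once per exact style name. A sketch only: tagFontStyles is a made-up helper, the list of style names is an example, and the tag "i" is assumed to be usable in your document. The application and document are passed in as parameters so the function can also be tried outside InDesign with stand-in objects.

```javascript
// Marks up every run whose font style matches one of styleNames with tagName.
// Returns the total number of text ranges that changeText() reported.
function tagFontStyles(indesignApp, doc, styleNames, tagName) {
    var changed = 0;
    for (var i = 0; i < styleNames.length; i++) {
        // Start each pass from clean Find/Change settings.
        indesignApp.findTextPreferences = null;
        indesignApp.changeTextPreferences = null;
        indesignApp.findTextPreferences.fontStyle = styleNames[i];
        indesignApp.changeTextPreferences.markupTag = tagName;
        // changeText() returns an array of the changed text objects.
        changed += doc.changeText().length;
    }
    // Leave the Find/Change settings empty afterwards.
    indesignApp.findTextPreferences = null;
    indesignApp.changeTextPreferences = null;
    return changed;
}

// Inside InDesign only; the style names here are examples.
if (typeof app !== "undefined" && app.documents.length > 0) {
    tagFontStyles(app, app.activeDocument, ["Italic", "Medium Italic"], "i");
}
```

                  The style names have to match the font's own variant names exactly, so it is worth building the list from the fonts actually used in the document rather than typing it by hand.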