10 Replies Latest reply on Jun 28, 2011 9:07 AM by David W. Goodrich

    List of languages used in an ID file?

    David W. Goodrich Level 3

      Is it possible to find out what language attributes have been applied to text in an InDesign file directly, that is, other than by using ID's search to go one by one through the long list of languages available?  Or in Word or Open Office Writer?

       

      I frequently receive files containing text in multiple languages, including Japanese, Chinese and occasionally Korean.  Most of the text is English, for which I want ID to hyphenate according to its rules for English: USA.  But sometimes long stretches of English text arrive with the language attribute set to a member of the CJK family, effectively preventing hyphenation.  For hyphenation this is no big deal: find anything in the alphabetic font and change the language attribute.

       

      But often there are other oddities.  Recently, a file arrived where a string of romanized Japanese had the attribute for German applied, with the German attribute bleeding over for half a line of the following English-language text.  Rather than search for all possible languages, I'd like to get a list of the language attributes actually in use so I can check just those.  I want to assure my customers (who, judging by the files they submit, do not understand language attributes anyway) that they don't have to worry about tagging the languages properly because I'll do it for them.  Rorohiko's FrameReporter recently announced it can do this for frames, but I'd like to be able to do this for stories running well over 100 pages/frames.

       

      Thanks in advance,

       

      David

        • 1. Re: List of languages used in an ID file?
          John Hawkinson Level 5

          You should be able to straightforwardly extract this information by exporting the InDesign file to IDML and searching the IDML file for the XML string that indicates a language. It should be human-readable. You will need to unzip the IDML file first; it is a ZIP-format file.
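
A minimal Python sketch of this approach, assuming the language is stored in the package's XML as an `AppliedLanguage="..."` attribute (the IDML attribute name quoted elsewhere in this thread). The function name is hypothetical. Python's `zipfile` module reads the ZIP-format package directly, so no separate unzip step is needed.

```python
import re
import zipfile

# Assumption: languages are stored as AppliedLanguage="..." attributes.
LANG_ATTR = re.compile(r'AppliedLanguage="([^"]*)"')

def languages_in_idml(path):
    """Return the sorted, de-duplicated language values found in an IDML package."""
    found = set()
    with zipfile.ZipFile(path) as package:
        for name in package.namelist():
            # Scan every XML part of the package, not just the stories.
            if name.endswith(".xml"):
                text = package.read(name).decode("utf-8", errors="replace")
                found.update(LANG_ATTR.findall(text))
    return sorted(found)
```

Calling `languages_in_idml("job.idml")` (file name hypothetical) would return one entry per distinct language value, which is a much shorter list than the full set of languages InDesign offers.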

          • 2. Re: List of languages used in an ID file?
            mindsteam Level 1

            Hi David,

             

            Our product LanguageLamp Pro is designed for working with documents containing multiple languages.

             

            www.mindsteam.com/products/languagelamppro

             

            Here is another thread with screenshots.

             

            http://forums.adobe.com/message/2606841#2606841

             

            Best regards,

            Heath Horton

            Mindsteam Software

            • 3. Re: List of languages used in an ID file?
              Eugene Tyson Adobe Community Professional & MVP

              There's a link on the left of this site:

              http://jsid.blogspot.com/2005/10/text-styles-reporter.html

              It's called Text Style Reporter.

              Download the zip file and extract the contents.

              Copy the files to the Scripts folder in InDesign.

              Run the script and select only Advanced Character Formats, and this should show you what language is used in the style.

              • 4. Re: List of languages used in an ID file?
                David W. Goodrich Level 3

                Thank you all.  I had tried exporting to tagged text and IDML, but was intimidated by the number of instances of "cLanguage" (tagged text) and "AppliedLanguage" (IDML) -- Notepad++ conveniently provides counts of instances, well over 600 in my test file.  It took me longer than it should have to realize I could replace "English: USA" with a token to reduce instances to a more manageable number -- and I found more oddities than I expected.  Some were explicable but some seemed totally random.  I suppose that is what happens when multiple authors and editors work on a file on various continents, with chunks of text (bibliographical references) pulled off the WWW.
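
The count-then-filter step described here could be sketched in Python roughly as follows. It assumes the exported XML is already in a string and that languages appear as `AppliedLanguage="..."` attributes; the function name and the idea of matching on the end of the value (to tolerate any prefix the export adds) are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import re
from collections import Counter

# Assumption: languages are stored as AppliedLanguage="..." attributes.
LANG_ATTR = re.compile(r'AppliedLanguage="([^"]*)"')

def tally_unexpected(xml_text, expected="English: USA"):
    """Count each language value, leaving out the expected (dominant) language."""
    counts = Counter(LANG_ATTR.findall(xml_text))
    # Keep only values that do not end with the expected language name.
    return {lang: n for lang, n in counts.items() if not lang.endswith(expected)}
```

What remains is the short list of oddities worth checking by hand, with a count for each.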

                 

                I also considered changing the color attributes throughout, and then going back to black for English, leaving the rest highlighted.  A quick glance at LanguageLamp Pro suggests its Preflight feature can automate this.  I've only just begun to experiment with IDCS5.5, but expect to start using it for real jobs before long, and that may be the way to go.  (I'm also looking forward to using Harbs' World Tools Pro.)

                 

                David

                • 5. Re: List of languages used in an ID file?
                  RorohikoKris Level 2

                  Hi David,

                   

                  Our FrameReporter actually reports on the languages used in the whole story 'running through' any particular frame. So as long as your stories are really multi-frame stories, you'd simply click on any frame that contains part of any story and see what the languages are...

                   

                  Cheers,

                   

                  Kris

                  • 6. Re: List of languages used in an ID file?
                    David W. Goodrich Level 3

                    Kris,

                    Guess I jumped to the wrong conclusion -- sorry about that.  This sounds promising.  Thanks,

                    David

                    • 7. Re: List of languages used in an ID file?
                      TᴀW Adobe Community Professional & MVP

                      This script: http://www.freelancebookdesign.com/?page_id=390 can do that. Select only the 5th option (language), and it will go through the document creating character styles wherever it finds text in a paragraph that uses a different language from the majority of text in that paragraph. There's a link to the demo version on the page.

                       

                      The character styles will be named with the language used: e.g. "Language: Korean", etc.

                       

                      Does that help?

                      Ariel

                      www.FreelanceBookDesign.com

                      • 8. Re: List of languages used in an ID file?
                        David W. Goodrich Level 3

                        Arïel,

                         

                        Your script for generating character styles from local overrides looks very useful in general, and of course having Language as one of the many selectable attributes could be especially helpful in multi-lingual settings: I can see making it part of my process for bringing files into ID.  In the case of my test file, however, I'd be wary of losing the half-dozen character styles I've already applied as ID restricts characters to having just one style.  I'll certainly keep this in mind.

                         

                        David

                        • 9. Re: List of languages used in an ID file?
                          TᴀW Adobe Community Professional & MVP

                          Hi David,

                           

                          I think it is useful. Do you think it would be better that if the script finds a character style applied already, it shouldn't apply its own? In other words, it shouldn't override existing character styles? I think that makes sense. It's an easy adjustment to make.

                           

                          Thanks,

                          Ariel

                          • 10. Re: List of languages used in an ID file?
                            David W. Goodrich Level 3

                            Arïel,

                             

                            My guess is that so long as ID allows just one style per character, the usual preference would be to have your script override existing character styles.  The files I receive come with all kinds of odd styles, as authors or editors paste in snippets from old papers or the WWW, bringing in styles like Strong or Emphasis.  I must zap those -- easy enough by hand because ID marks imports on the menu.  Were I using a script like yours I think I'd expect it to do the zapping for me.  Next, while still at that early stage, I could use GREP to apply my own character styles to CJK strings based on the codes, allowing me to find all the misplaced CJK and other language attributes via the styles generated by your script.  Definitely workable for bringing files into ID.
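
As a rough illustration of matching CJK strings by code point, here is a Python sketch rather than an InDesign GREP expression. The ranges cover only four common Unicode blocks and are an assumption about what "the codes" would include; a production pattern would likely need more blocks (extensions, punctuation, half-width forms). The function name is hypothetical.

```python
import re

# Assumed subset of Unicode blocks; not exhaustive.
CJK_RUN = re.compile(
    "["
    "\u3040-\u309f"  # Hiragana
    "\u30a0-\u30ff"  # Katakana
    "\u4e00-\u9fff"  # CJK Unified Ideographs
    "\uac00-\ud7af"  # Hangul Syllables
    "]+"
)

def cjk_runs(text):
    """Return (start offset, matched text) for each maximal run of CJK characters."""
    return [(m.start(), m.group()) for m in CJK_RUN.finditer(text)]
```

Anything outside these runs that still carries a CJK language attribute, or anything inside them tagged as a Western language, would be a candidate for correction.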

                             

                            My test file is a special case, however: I failed to notice language attributes had been applied incorrectly until eyeballing hyphenation late in the process.  I suppose I was hoping for a sort of Swatches panel for languages.  FrameReporter offers a list of languages used in a story, while LanguageLamp takes a different approach, offering a visual representation and (with Pro) pre-flighting.

                             

                            That particular job is out the door, but I now have lots to think about for next time.  Thank you all again,

                             

                            David