7 Replies Latest reply on Jan 15, 2011 2:25 PM by Haakenlid

    Apply Paragraph Styles based on text Language.

    MatsT

      I need to apply different paragraph styles to different parts of a document based on what language the text is in. What would be the best way to approach that? Specifically this has to do with right-to-left text direction for Arabic text. In some cases there can be English names inside these Arabic paragraphs that should still be left-to-right. It is up to the user to make sure that the language is correctly set for the different parts of the text.

       

      I suspect this is easiest to do via scripting, but a plugin solution is also very possible as we already have a plugin as part of the solution for doing other stuff.

        • 1. Re: Apply Paragraph Styles based on text Language.
          Haakenlid Level 3

          You can do that with search and replace. It can be scripted, if you want to do it many times.

           

          Make a paragraph style called, for example, "Arabic" and a character style called "Latin".

           

          Do a GREP search to change all Arabic text to the "Arabic" paragraph style. Arabic is in the Unicode range 0600–06FF.

           

          [\x{0600}-\x{06FF}]+

           

          Then apply the "Latin" character style to all Latin text that has the "Arabic" paragraph style.

           

          [\l\u]
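
The two passes above can be sketched outside InDesign to check the matching logic. This is only an illustration in Python, not InDesign scripting: the function name `classify_paragraph` is hypothetical, and `[A-Za-z]+` is an assumed ASCII-only stand-in for InDesign's `[\l\u]`.

```python
import re

# Arabic block named in the post: U+0600 to U+06FF.
ARABIC_RUN = re.compile(r"[\u0600-\u06FF]+")
# Assumption: ASCII letters only, as a stand-in for InDesign's [\l\u].
LATIN_RUN = re.compile(r"[A-Za-z]+")


def classify_paragraph(text):
    """Return (paragraph_style, latin_spans) for one paragraph.

    paragraph_style is "Arabic" if the paragraph contains any Arabic
    character, otherwise None. latin_spans lists the (start, end) offsets
    that would receive the "Latin" character style.
    """
    if not ARABIC_RUN.search(text):
        return None, []
    latin_spans = [m.span() for m in LATIN_RUN.finditer(text)]
    return "Arabic", latin_spans


paragraphs = [
    "Plain English paragraph.",
    "\u0645\u0631\u062d\u0628\u0627 Adobe \u0628\u0643\u0645",
]
for p in paragraphs:
    print(classify_paragraph(p))
```

Because the Latin pattern matches letters only, the spaces between words stay outside the character style.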
          
          • 2. Re: Apply Paragraph Styles based on text Language.
            MatsT Level 1

            This seems to work really well. I was able to write a one-page script that sets all Arabic text (or actually all paragraphs containing Arabic text) to right-to-left direction but keeps the Roman characters turned the right way. It looks really weird when there is hyphenation in the middle of a Roman word, since the paragraph and character directions don't match there, but I'm fairly certain that the hyphenation provider will not hyphenate there in the end anyway.

             

            What I am not sure about is how reliable it is to look only at the Unicode range to determine whether text should be reversed. I don't have access to any production documents to verify this method, but the requirements state that the user needs to make sure that all text has the correct language set. I assume they are talking about the appliedLanguage property of the style currently used. There could hypothetically be situations where they want to turn non-Arabic characters around as well by using this property. Is there a way to iterate over a document and detect all places where a specific language is used?

            • 3. Re: Apply Paragraph Styles based on text Language.
              Haakenlid Level 3

              MatsT wrote:

               

              This seems to work really well. I was able to write a one-page script that sets all Arabic text (or actually all paragraphs containing Arabic text) to right-to-left direction but keeps the Roman characters turned the right way. It looks really weird when there is hyphenation in the middle of a Roman word, since the paragraph and character directions don't match there, but I'm fairly certain that the hyphenation provider will not hyphenate there in the end anyway.

              You can turn off hyphenation in the Latin character style.

               

              What I am not sure about is how reliable it is to look only at the Unicode range to determine whether text should be reversed. I don't have access to any production documents to verify this method, but the requirements state that the user needs to make sure that all text has the correct language set. I assume they are talking about the appliedLanguage property of the style currently used. There could hypothetically be situations where they want to turn non-Arabic characters around as well by using this property. Is there a way to iterate over a document and detect all places where a specific language is used?

              That is possible. You can iterate through Stories and TextStyleRanges and check their appliedLanguage.

               

              But by using the Unicode GREP method you can be certain to find Arabic text even when the appliedLanguage is not set correctly. Of course, it only works for languages that use a unique alphabet; it cannot distinguish between English and French, for instance.
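
The traversal described above (stories, then text style ranges, then a check on the applied language) can be modelled roughly as follows. This is a sketch in Python with stand-in classes, not the InDesign object model: `Story`, `TextStyleRange`, `applied_language` and `ranges_in_language` are hypothetical names that only mirror the properties mentioned in the thread, and the language names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class TextStyleRange:
    contents: str
    applied_language: str  # stand-in for InDesign's appliedLanguage


@dataclass
class Story:
    text_style_ranges: list = field(default_factory=list)


def ranges_in_language(stories, language_name):
    """Yield (story_index, range_index, contents) for every matching range."""
    for s_idx, story in enumerate(stories):
        for r_idx, rng in enumerate(story.text_style_ranges):
            if rng.applied_language == language_name:
                yield s_idx, r_idx, rng.contents


stories = [
    Story([
        TextStyleRange("Hello ", "English: USA"),
        TextStyleRange("\u0645\u0631\u062d\u0628\u0627", "Arabic"),
    ]),
    Story([
        # Arabic text whose language was left at the default by mistake.
        TextStyleRange("\u0628\u0643\u0645", "English: USA"),
        # Latin text deliberately tagged as Arabic by the user.
        TextStyleRange("Adobe", "Arabic"),
    ]),
]

matches = list(ranges_in_language(stories, "Arabic"))
print(matches)
```

The sample data shows the trade-off between the two approaches: the language check misses the mislabelled Arabic range that a Unicode-range search would find, but it does pick up Latin text that the user has deliberately tagged as Arabic.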

              • 4. Re: Apply Paragraph Styles based on text Language.
                Harbs. Level 6

                Check out Multilingual Tools.

                 

                It has a "Language Styles" feature which does what you want: http://in-tools.com/plugin.php?p=26

                 

                Harbs

                • 5. Re: Apply Paragraph Styles based on text Language.
                  [Jongware]-9BC6tI Level 4

                  (FYI only)

                   

                  You can turn off hyphenation in the Latin character style.

                   

                  No you can't -- hyphenation is not a character attribute, it's a paragraph attribute.

                   

                  You can set "No Break" in a character style, but before suggesting that I'd take a long, hard look at how much text there is.

                  • 6. Re: Apply Paragraph Styles based on text Language.
                    Harbs. Level 6

                    You can make a character style with the language set to [No Language], and it will not be hyphenated...

                     

                    Harbs

                    • 7. Re: Apply Paragraph Styles based on text Language.
                      Haakenlid Level 3

                      [Jongware] wrote:

                       

                      (FYI only)

                       

                      You can turn off hyphenation in the Latin character style.

                       

                      No you can't -- hyphenation is not a character attribute, it's a paragraph attribute.

                       

                      You can set "No Break" in a character style, but before suggesting that I'd take a long, hard look at how much text there is.

                       

                      True. If you do not apply the character style with "No Break" to the whitespace between words it might be ok, though. But the "No Language" suggestion seems like a better idea.