14 Replies Latest reply: Apr 23, 2012 8:51 AM by Trevorׅ

    Change multiple finds - Erase all Diacritics

    Trevorׅ Community Member

      Hi all,

       

      I made a script to remove a range of diacritics (the squiggly bits at the top and bottom of letters) from selected text. It works, but I thought it could be made more efficient by using findText().

      My question is: can one search for a range of Unicode values (or, for that matter, a list of words like mom, mum, mommy, mummy, mam, etc.) so that they can be deleted or changed to the same thing (in the case of the word list, to mother) without having to loop through every character in the selection?

       

      My script is:


      #target "InDesign"
      app.doScript("main()", ScriptLanguage.javascript, undefined, UndoModes.FAST_ENTIRE_SCRIPT, "Remove Vowels");

      function main()
      {
          var cc, t, w, x, d, q, myChar, unicode;
          cc = 0;
          t = app.selection[0];
          w = [];   // every character in the selection
          x = [];   // indices of the characters to remove
          // First pass: record the index of each diacritic
          for (d = 0; d < t.characters.length; d++) {
              w[d] = t.characters[d];
              try {
                  myChar = w[d].contents;
                  unicode = myChar.charCodeAt(0);
                  // Unicode range to remove
                  if ((unicode > 0x0590 && unicode < 0x05BE) ||
                      (unicode > 0x05C0 && unicode < 0x05C3) ||
                      (unicode > 0x05C3 && unicode < 0x05C6) ||
                      unicode == 0x05BF ||
                      unicode == 0x05C7) {
                      x[cc] = d;
                      cc++;
                  }
              }
              catch (noUnicode) {} // contents may be a special-character enum rather than a string
          }
          // Second pass: remove from the end so that earlier indices stay valid
          for (q = cc - 1; q > -1; q--) {
              try {
                  w[x[q]].remove();
              }
              catch (error) {}
          }
      }
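
The range test at the heart of the script can be tried outside InDesign. A minimal sketch in plain JavaScript, assuming two hypothetical helper names, `isHebrewDiacritic` and `stripDiacritics` (neither is part of the original script), and the same code points the script removes:

```javascript
// Hypothetical helper: true for the code points the script removes
// (0591-05BD, 05BF, 05C1, 05C2, 05C4, 05C5, 05C7).
function isHebrewDiacritic(code) {
    return (code > 0x0590 && code < 0x05BE) ||
           (code > 0x05C0 && code < 0x05C3) ||
           (code > 0x05C3 && code < 0x05C6) ||
           code == 0x05BF ||
           code == 0x05C7;
}

// Hypothetical helper: strip the matching characters from an ordinary
// string, one character at a time, as the script does with the selection.
function stripDiacritics(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        if (!isHebrewDiacritic(text.charCodeAt(i))) {
            out += text.charAt(i);
        }
    }
    return out;
}

// A made-up sample: Hebrew letters with points, written as escapes
// so that the combining marks stay visible in the source.
var pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD";
var plain = stripDiacritics(pointed);
```

Only the letters survive in `plain`; the punctuation code points that sit between the ranges (05BE, 05C0, 05C3, 05C6) are left alone, which is why the test is split into several ranges.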
      

       

      I would also like to know whether one can change a Unicode range or word list using the regular InDesign Find/Change interface.

       

      Thanks in advance.

       

      Trevor

        • 1. Re: Change multiple finds - Erase all Diacritics
          Trevorׅ Community Member

          I found out how to use the main InDesign interface to find a list of words.

          Search using GREP, then go to Match and then Or.

          It should be easy to find out how to script that, but I am still doubtful about the Unicode ranges.


          • 2. Re: Change multiple finds - Erase all Diacritics
            pkahrel CommunityMVP

            Find unicode ranges:

             

            [\x{0590}-\x{05BE}]  (find range 0590-05BE)

            [\x{0590}-\x{05BE}\x{05C0}-\x{05C6}]  (find ranges 0590-05BE and 05C0-05C6)

             

            Replace items from a list with a single item:

             

            Find what: \b(mom|mum|mommy|mummy|mam)\b

            Replace with mother

             

            You need to do this in the GREP tab.
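
Both patterns can be tried in any regular-expression engine. The sketch below uses plain JavaScript's RegExp rather than InDesign's GREP engine, so it only illustrates the pattern logic; a JavaScript regex literal writes code points as \uXXXX where the GREP tab uses \x{XXXX}. The sample strings are made up for illustration.

```javascript
// Alternation with word boundaries: each listed word becomes "mother".
var wordList = /\b(mom|mum|mommy|mummy|mam)\b/g;
var sentence = "My mum and your mommy met my mam.";
var unified = sentence.replace(wordList, "mother");

// Two ranges in one character class, mirroring
// [\x{0590}-\x{05BE}\x{05C0}-\x{05C6}] from the post above.
var ranges = /[\u0590-\u05BE\u05C0-\u05C6]/g;
var pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD";
var stripped = pointed.replace(ranges, "");
```

The \b anchors matter: without them, "mom" would match inside "mommy" and leave a stray "my" behind.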

             

            Peter

            • 3. Re: Change multiple finds - Erase all Diacritics
              Trevorׅ Community Member

              Peter

               

              Brilliant, I was at least 90% sure that you would be the one to answer.

               

              In a script it goes:

               

              var mySelection = app.selection[0];
              app.findGrepPreferences = NothingEnum.nothing;
              app.changeGrepPreferences = NothingEnum.nothing;
              app.findChangeGrepOptions.includeFootnotes = false;
              app.findChangeGrepOptions.includeHiddenLayers = false;
              app.findChangeGrepOptions.includeLockedLayersForFind = false;
              app.findChangeGrepOptions.includeLockedStoriesForFind = false;
              app.findChangeGrepOptions.includeMasterPages = false;
              //Unicode Range
              app.findGrepPreferences.findWhat = "[<0591>-<05BD><05C1><05C2><05C4><05C5><05C7><05BF>]";
              app.changeGrepPreferences.changeTo = NothingEnum.nothing;
              mySelection.changeGrep();
              app.findGrepPreferences = NothingEnum.nothing;
              app.changeGrepPreferences = NothingEnum.nothing;
              

               

              I found the basic <0591> format in a post of yours on this forum from 5 years back (http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html), and your answer above gave away the missing details.

               

              I guess it would be a very good idea to buy this book http://shop.oreilly.com/product/9780596156015.do

               

              This script is countless times quicker than the one above.

               

              Thanks a million.

              • 4. Re: Change multiple finds - Erase all Diacritics
                pkahrel CommunityMVP

                Trevor,

                 

                The <0000> format is replaced with the corresponding character in the Find what field, which often makes it barely readable. The \x{0000} format is not replaced, and I find that easier. As to that book, you guess right!

                 

                Peter

                • 5. Re: Change multiple finds - Erase all Diacritics
                  Trevorׅ Community Member

                  Thanks Peter,

                   

                  When I wrote about using the <0000> format, I was referring to scripting and not to the GREP tab.

                  I think you must have seen this post in email form and missed the lines of script.

                   

                  In scripting, these three options work:

                  app.findGrepPreferences.findWhat = "[<0591>-<05BD><05C1><05C2><05C4><05C5><05C7><05BF>]";

                  app.findGrepPreferences.findWhat = "[\u0591-\u05BC\u05c2\u05C4\u05C7\u05BF]";

                  app.findGrepPreferences.findWhat = "[֑-ׇֿׁׂׅׄ]";

                   

                  This does not:

                  app.findGrepPreferences.findWhat = "[\x{0591}-\x{05BD}\x{05C1}\x{05C2}\x{05C4}\x{05C5}\x{05C7}\x{05BF}]";

                   

                  Don't know why.

                   

                  On the GREP tab, the unlucky option is [\u0591-\u05BC\u05c2\u05C4\u05C7\u05BF], which does not work properly (in fact, it hardly works at all!).

                  [\x{0591}-\x{05BD}\x{05C1}\x{05C2}\x{05C4}\x{05C5}\x{05C7}\x{05BF}] scores top for readability.

                  Both [<0591>-<05BD><05C1><05C2><05C4><05C5><05C7><05BF>] and [֑-ׇֿׁׂׅׄ] work (as you say, the former turns into the latter in the Find what field). They have the readability issue, but on the other hand they are easier to enter if you can read them.

                   

                  Anyway, I'm quite pleased: from not knowing any way to use the GREP tab or the app.findGrepPreferences.findWhat = method in a script (besides one diacritic at a time!), I now know 3 for each!

                   

                  Regards, Trevor

                   

                  P.S. I plan to get the book later in the day!

                  • 6. Re: Change multiple finds - Erase all Diacritics
                    [Jongware] CommunityMVP

                    If you write out GREP expressions in JavaScript to use with findGrep/changeGrep, you must take into account that backslashes inside a JavaScript string need escaping. Therefore you need to double each of them:

                     

                    \\x{0591}

                     

                    (etc.)

                     

                    The "exceptions" -- there are always some -- are \r, \t, and \n, but in fact those aren't as special as they seem. They get translated into literal character codes for Carriage Return, Tab, and Line Feed, and as it happens, those can be fed as well into the findWhat string, even though you cannot type them in the interface (after inserting them with your script, sometimes you can see the GREP find field struggle with trying to display the string).
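
The effect of doubling the backslash can be checked in any JavaScript console. A small sketch in plain JavaScript, run outside InDesign, so it only inspects the strings themselves; what the GREP engine then does with them is as described in this thread:

```javascript
// What the GREP engine receives is the string after JavaScript has
// processed its own escapes, so inspect the resulting characters.
var doubled = "\\x{0591}";      // the backslash survives: \x{0591}
var property = "\\p{Mn}";       // the backslash survives: \p{Mn}
var unicodeEscape = "\u0591";   // JavaScript turns this into the character itself
var carriageReturn = "\r";      // JavaScript turns this into character code 13

var doubledLength = doubled.length;                  // counts \ x { 0 5 9 1 }
var firstIsBackslash = doubled.charCodeAt(0) === 92; // 92 is the backslash
var escapeLength = unicodeEscape.length;             // a single character
var returnCode = carriageReturn.charCodeAt(0);
```

This would also explain the earlier observation that a "\u0591" string works from a script while \u0591 typed into the GREP tab does not: the script hands InDesign the character itself, whereas in the GREP tab \u appears to be read as the any-uppercase-letter wildcard followed by literal digits.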

                     

                    You could check whether the special Unicode GREP class "\p{Mn}" finds all of the nonspacing marks you want to get rid of -- I think this class of commands is mentioned in Peter's book as well.
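
One way to see what \p{Mn} covers is to compare it with the explicit set of code points from earlier in the thread. The sketch below uses plain modern JavaScript (the `u` flag and \p{...} need an ES2018 engine such as a current Node or browser; ExtendScript's own RegExp does not support them), so it illustrates the Unicode property itself, not InDesign's GREP engine.

```javascript
// Nonspacing marks, via the Unicode general category Mn.
var nonspacingMark = /\p{Mn}/u;
// The explicit set of code points used earlier in the thread.
var explicitSet = /[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]/;

// Collect every code point from 0590 to 05C7 on which the two disagree.
var disagreements = [];
for (var code = 0x0590; code <= 0x05C7; code++) {
    var ch = String.fromCharCode(code);
    if (nonspacingMark.test(ch) !== explicitSet.test(ch)) {
        disagreements.push(code.toString(16));
    }
}

// Outside that range, \p{Mn} also matches marks from other scripts,
// for example U+0301 COMBINING ACUTE ACCENT.
var matchesLatinAccent = nonspacingMark.test("\u0301");
```

The two patterns agree across that range, but \p{Mn} reaches further, so it would also strip combining marks from other scripts if the selection contains any.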

                    • 7. Re: Change multiple finds - Erase all Diacritics
                      pkahrel CommunityMVP

                      Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.

                       

                      Peter

                      • 8. Re: Change multiple finds - Erase all Diacritics
                        Trevorׅ Community Member

                        Jongware

                         

                        I should have been able to figure out the escaping of the \,

                         

                        Oh well better luck next time.

                         

                        So now I have another 2 methods for the scripting:

                        app.findGrepPreferences.findWhat = "[\\x{0591}-\\x{05BD}\\x{05C1}\\x{05C2}\\x{05C4}\\x{05C5}\\x{05C7}\\x{05BF}]";

                        and

                        app.findGrepPreferences.findWhat = "\\p{Mn}";

                         

                        I did try the \p{Mn} method in the script but it didn't work because I didn't escape it.

                         

                        and in the GREP tab another one:

                        \p{Mn}

                         

                        Well, suddenly overwhelmed with choice, the winning script is:

                         

                        var mySelection = app.selection[0];
                        app.findGrepPreferences = NothingEnum.nothing;
                        app.changeGrepPreferences = NothingEnum.nothing;
                        // Unicode property: nonspacing marks (Mn)
                        app.findGrepPreferences.findWhat = "\\p{Mn}";
                        app.changeGrepPreferences.changeTo = NothingEnum.nothing;
                        mySelection.changeGrep();
                        app.findGrepPreferences = NothingEnum.nothing;
                        app.changeGrepPreferences = NothingEnum.nothing;
                        

                         

                        Short and sweet (and quick).

                         

                        Peter

                         

                        I kept my word about getting the book, and you can test me on the 37 \p{} methods tomorrow.

                        • 9. Re: Change multiple finds - Erase all Diacritics
                          pkahrel CommunityMVP

                          If it's brevity you're after:

                           

                          app.findGrepPreferences = app.changeGrepPreferences = null;
                          // Unicode property: nonspacing marks (Mn)
                          app.findGrepPreferences.findWhat = "\\p{Mn}";
                          app.selection[0].changeGrep();
                          app.findGrepPreferences = app.changeGrepPreferences = null;
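
If the same reset, set, change, reset sequence is needed in several scripts, it could be wrapped in a function. A minimal sketch, assuming a hypothetical helper name `changeGrepOn` (not part of InDesign's API) and that `app` and the target text object are passed in as arguments; it uses only the property names and the changeGrep() call that appear in the scripts in this thread. The try/finally is there so that the preferences are cleared even if changeGrep() throws.

```javascript
// Hypothetical helper: run one GREP change on `target`, clearing the
// find/change GREP preferences before and after.
function changeGrepOn(app, target, findWhat, changeTo) {
    app.findGrepPreferences = app.changeGrepPreferences = null;
    try {
        app.findGrepPreferences.findWhat = findWhat;
        // Leaving changeTo unset deletes whatever is found.
        if (changeTo !== undefined) {
            app.changeGrepPreferences.changeTo = changeTo;
        }
        return target.changeGrep();
    } finally {
        // Runs whether or not changeGrep() succeeded.
        app.findGrepPreferences = app.changeGrepPreferences = null;
    }
}

// In InDesign this would be called as, for example:
// changeGrepOn(app, app.selection[0], "\\p{Mn}");
```

Passing `app` in rather than relying on the global keeps the function easy to exercise outside InDesign with a stand-in object.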
                          

                           

                          Peter

                          • 10. Re: Change multiple finds - Erase all Diacritics
                            Trevorׅ Community Member

                            T Y

                             

                            I think that by comparing my original and this final script, one can see an excellent example of how well and how poorly a script can be made.

                             

                            Well, I'm happy I saw there was a problem and didn't have that "Very British attitude" ;-) and did complain, and something did change!

                             

                            (see towards the bottom of http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html)

                            • 11. Re: Change multiple finds - Erase all Diacritics
                              Trevorׅ Community Member

                              Peter

                               

                              I forgot to mention....

                               

                              I like the book. I see the basic GREP | (or) function is right on the very first page after the contents, although I got a little scared of the python that spat on the second page.

                               

                              Trevor

                              • 12. Re: Change multiple finds - Erase all Diacritics
                                absqua Community Member

                                Peter Kahrel wrote:

                                 

                                Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.

                                 

                                Wow. How have I gone this long without knowing about these? Guess I should have read your book. Here's another resource.

                                 

                                Jeff

                                • 13. Re: Change multiple finds - Erase all Diacritics
                                  pkahrel CommunityMVP

                                  It's never too late, Jeff! That source you mention is indeed very good. It's where I first learnt GREP, back in CS2 days. It's not InDesign-specific though, so not everything discussed there applies to InDesign. Good site nevertheless. Those codes are illustrated with an InDesign document here: http://www.kahrel.plus.com/indesign/grep_mapper.html

                                   

                                  Peter

                                  • 14. Re: Change multiple finds - Erase all Diacritics
                                    Trevorׅ Community Member

                                    Nice resource, Jeff. There's also a nice GREP mapper PDF table on Peter's site, but at (just) under $10, I'm sure Peter would second me that it's worth going for the book!

                                     

                                    (Just saw that Peter beat me to it with the mapper)