10 Replies Latest reply on Jul 1, 2012 12:18 PM by samar02

    GREP on a list

    samar02 Level 1

      Hi,

       

      I am working on a book document whose files contain some words of the following nature:

       

      ausstossen

      Hohlmass

      süss

      weiss

      weisse

      ...

       

      Each word has one or more strings of "ss", and one or more of these should be replaced by "ß". So the five lines above should look like this:

       

      ausstoßen

      Hohlmaß

      süß

      weiß

      weiße

      ...

       

      In order to accomplish this with Find/Change using GREP, I have prepared a text file with all words containing "ss". I then worked my way through this list, setting parentheses around the parts that do NOT contain an "ss" string that needs to be changed. So the list now looks like this:

       

      (aussto)ss(en)

      (Hohlma)ss

      (sü)ss

      (wei)ss

      (wei)ss(e)

      ...

       

      The idea is to use something like this:

       

      (aussto)ss(en)|(Hohlma)ss|(sü)ss|(wei)ss|(wei)ss(e)

       

      and then to use $1 and $2 in combination with ß. But how exactly do I do this so that the correct "ss" strings become ß and the rest of the text stays in place?
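One way to read the $1/$2 idea is to treat each list entry as its own find/change pair instead of one long alternation. In an alternation the group numbers keep counting from left to right, so "(Hohlma)" would be $3 rather than $1. The sketch below is plain JavaScript run outside InDesign and only illustrates the group logic; the helper name `applyEntries` is made up for this example, and it assumes every entry starts with a parenthesized part.

```javascript
// Hypothetical helper: apply each parenthesized list entry as its own replacement.
// Entries use the list format from the post, e.g. "(aussto)ss(en)".
function applyEntries(text, entries) {
  for (var i = 0; i < entries.length; i++) {
    // Count the groups in this entry: one group means "$1ß", two mean "$1ß$2".
    var groups = entries[i].match(/\(/g).length;
    var find = new RegExp(entries[i], "g");
    var change = groups === 2 ? "$1ß$2" : "$1ß";
    text = text.replace(find, change);
  }
  return text;
}

var entries = ["(aussto)ss(en)", "(Hohlma)ss", "(sü)ss", "(wei)ss", "(wei)ss(e)"];
var result = applyEntries("ausstossen Hohlmass süss weiss weisse", entries);
// result is "ausstoßen Hohlmaß süß weiß weiße"
```

Note that nothing here limits a match to whole words, so "(wei)ss" also hits the start of "weisse"; a real list would need word boundaries or a whole-word option.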

       

      CS5

        • 1. Re: GREP on a list
          Alec Molloy Employee Moderator

          I think this could be of some use: http://indesignsecrets.com/findbetween-a-useful-grep-string.php

           

          Viel Glück (good luck).

          • 2. Re: GREP on a list
            samar02 Level 1

            Thank you, Alec, this helps. So the solution is to use Positive Lookbehind and Positive Lookahead.

             

            And still I'm not quite there yet. What I have come up with so far is

             

            Find what

            (?<=\b(aussto|Hohlma|sü|wei|wei))ss(?=(en|e)\b)

             

            Change to

            ß

             

            This lets me find "ausstossen" and change it to "ausstoßen", which is fine, but there are no more matches. Why?

             

            Somehow I need to find out how to deal with the second part of this GREP. The string

             

            (?=(en|e)\b)

             

            should match the first ("aussto") and the fifth (the second occurrence of "wei") part of the first string, but obviously it doesn't. Any idea?

            • 3. Re: GREP on a list
              [Jongware] Most Valuable Participant

              Yeah well GREP does not work like that.

               

              You cannot have strings of different lengths in a lookbehind; only one of the alternatives (the first, the longest, the shortest, or something like that) will be found. In your case it seems it was the first.

               

              You also cannot feed it a list like (a|b|c)(d|e|f) and expect a to be paired only with d, b with e, and c with f.

               

              If you are really, really really (ABSOLUTELY etc.) determined to do all of this in one single find-and-replace, use this:

               

              (?<=aussto)ss(?=en)|(?<=Hohlma)ss|(?<=sü)ss|(?<=wei)ss|(?<=wei)ss(?=e)

               

              ... and if this is your entire list, it ought to work. But there are limits to the length of a GREP and to the number of "operators" -- separate parts that should get processed --, and I'm pretty sure you cannot add the entire Duden (*) list of re-spelled ss/ß words.
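As a rough illustration of the one-alternative-per-word pattern, the following sketch builds it from the parenthesized list and tries it on the sample words. It is plain JavaScript for a recent engine such as Node.js (lookbehind is a fairly new addition to JavaScript and would likely not run in ExtendScript's older engine), and the helper name `toLookaround` is hypothetical.

```javascript
// Hypothetical helper: turn "(aussto)ss(en)" into "(?<=aussto)ss(?=en)".
function toLookaround(entry) {
  var parts = /^\(([^)]*)\)ss(?:\(([^)]*)\))?$/.exec(entry);
  if (!parts) { return null; }                 // entry is not in the expected format
  var pattern = "(?<=" + parts[1] + ")ss";     // text before "ss" goes in a lookbehind
  if (parts[2]) { pattern += "(?=" + parts[2] + ")"; } // text after "ss" in a lookahead
  return pattern;
}

var list = ["(aussto)ss(en)", "(Hohlma)ss", "(sü)ss", "(wei)ss", "(wei)ss(e)"];
var alternatives = [];
for (var i = 0; i < list.length; i++) {
  var p = toLookaround(list[i]);
  if (p) { alternatives.push(p); }
}

// Each alternative carries its own fixed-length lookbehind.
var combined = alternatives.join("|");
var changed = "ausstossen Hohlmass süss weiss weisse"
  .replace(new RegExp(combined, "g"), "ß");
// changed is "ausstoßen Hohlmaß süß weiß weiße"
```

With a list of a thousand entries the combined string becomes very long, which is where the length limits mentioned above come in.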

               

              Use FindChangeByList instead! You don't even have to use GREP, you can simply change each word, one at a time, with the appropriate one. After changing the findchangebylist data file, all it takes is one double-click to do your changes.

               

               

              (*) That's German for "dudes"

              • 4. Re: GREP on a list
                samar02 Level 1

                Thank you, Jongware, for your reply. So I'm trying to come to grips with this script, then, leaving GREP behind. And yes, my entire list is rather long (+1000 entries).

                 

                The problem is: I seem to understand (most of) what is written in "FindChangeList.txt" (in the FindChangeSupport folder), but not what "FindChangeByList.jsx" says in the second part.

                 

                In other words: It is no problem to create a tab-delimited text file of the following nature:

                 

                text	{findWhat:"ausstossen"}	{changeTo:"ausstoßen"}	{includeFootnotes:false, includeMasterPages:false, includeHiddenLayers:false, wholeWord:true}	Find and replace a word

                 

                But what should come before these entries, what after?

                 

                I know this should belong to the Scripts forum, but since it all began here ... Perhaps the answer to this question is simple?

                • 5. Re: GREP on a list
                  [Jongware] Most Valuable Participant

                  > But what should come before these entries, what after?

                   

                  Uh. More entries? Each replacement comes on a line of its own. I've never used this variant of FindChangeByList but I think it's not even necessary to include the options (3rd set of commands) and comment (final set of commands).
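If the entries do have to be written in the FindChangeList.txt format, they can be generated rather than typed. A minimal sketch in plain JavaScript, assuming the five tab-separated fields of the entry quoted above (search type, find, change, options, comment); the helper name `toFindChangeLine` and the comment text are made up for this example.

```javascript
// Hypothetical helper: build one tab-delimited FindChangeList.txt line.
function toFindChangeLine(findWhat, changeTo) {
  // Options copied from the entry quoted in the post above.
  var options = "{includeFootnotes:false, includeMasterPages:false, " +
                "includeHiddenLayers:false, wholeWord:true}";
  return ["text",
          '{findWhat:"' + findWhat + '"}',
          '{changeTo:"' + changeTo + '"}',
          options,
          "Replace " + findWhat + " with " + changeTo].join("\t");
}

var pairs = [["ausstossen", "ausstoßen"], ["Hohlmass", "Hohlmaß"], ["süss", "süß"]];
var lines = [];
for (var i = 0; i < pairs.length; i++) {
  lines.push(toFindChangeLine(pairs[i][0], pairs[i][1]));
}

// One replacement per line, ready to be pasted into the list file.
var listFile = lines.join("\n");
```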

                   

                  Did you already prepare your entire list in the parenthesized format you planned to use before? If so, I think I can make up a quick script to use that instead, sparing you the trouble of (again) transposing everything into the format required by FindChangeByList.

                  • 6. Re: GREP on a list
                    samar02 Level 1

                    Jongware, this would be very kind!

                     

                    I will have two lists. The shorter one is completed and of the following nature (no parentheses anymore):

                     

                    ...

                    ausstossen     ausstoßen

                    Hohlmass     Hohlmaß

                    süss     süß

                    weiss     weiß

                    weisse     weiße

                    ...

                     

                    So each entry/line has three units:

                    1. the word that should be found

                    2. a tab

                    3. the replacement

                     

                    The following parameters should be on (and nothing else), using a "Text" Find what:

                    Search: All Documents

                    Case Sensitive

                    Whole Word

                     

                    I must confess I have no real scripting experience so your help is greatly appreciated.

                     

                    I am using CS5 ME.

                    • 7. Re: GREP on a list
                      [Jongware] Most Valuable Participant

                      No trouble at all, stuff like this takes me mere minutes.

                       

                      > I am using CS5 ME.

                       

                      I'm using CS4 Roman, but I'm quite sure this is no problem.

                       

                      This is JavaScript, so save it as 'changeDoubleEs.jsx'. Make sure to change the file name in the script at step 1a to the one you are using!

                       

                      It's all pretty straightforward, except maybe the chained series of 'replace' calls. These are *simple* regular-expression changes -- they have nothing to do with InDesign's GREP and only operate on plain text strings inside JavaScript itself. They remove errant spaces from the start and end of each line, as well as before and after tabs. I know from personal experience it's *always* possible a space crept in somewhere, throwing off the actual replace operation.

                       

                      // 1. Load text file into memory
                                // a. put your file name here
                      f = new File("~/Documents/doublees.txt");
                                // b. attempt to open it
                      if (f.open('r') == false) { alert ('no file?'); exit() }
                                // c. read!
                      text = f.read();
                                // d. close file
                      f.close();
                      
                      // 2. Clean up text strings
                                // a. replace carriage returns with newline
                                //    not sure if this is necessary, but you never know
                      text = text.replace(/\r/g,'\n');
                                // b. remove possible errant spaces
                      text = text.replace(/\t +/g, '\t').replace(/ +\t/g, '\t').replace(/\t+/g, '\t');
                                // c. remove possible errant double returns
                      text = text.replace(/\n\n+/g,'\n');
                                // d. remove possible errant double tabs
                      text = text.replace(/\t\t+/g,'\t');
                                // e. split text into an array per line
                      text = text.split('\n');
                                // f. split each line per tab
                      for (i=0; i<text.length; i++)
                      {
                                // g. remove possible lingering spaces at start and end
                                text[i] = text[i].replace(/^ +/,'').replace(/ +$/,'');
                                // h. split!
                                text[i] = text[i].split('\t');
                      }
                      // Uncomment this line to see the list pop up
                      // alert (text.join('\r'));
                      
                      // 3. Set Find/Change parameters
                      app.findTextPreferences = null;
                      app.changeTextPreferences = null;
                      app.findChangeTextOptions.caseSensitive = true;
                      app.findChangeTextOptions.wholeWord = true;
                      
                      // 4. Loop over text strings
                      for (i=0; i<text.length; i++)
                      {
                                // a. only process if there are two elements
                                if (text[i].length == 2)
                                {
                                          // b. element [0] is 'find'
                                          app.findTextPreferences.findWhat = text[i][0];
                                          // c. element [1] is 'replace'
                                          app.changeTextPreferences.changeTo = text[i][1];
                                          // d. change!
                                          app.changeText();
                                }
                      }
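The cleanup in step 2 does not depend on InDesign, so it can be tried on its own in any JavaScript engine before running the full script. A minimal sketch that mirrors those steps; the function name `parsePairs` and the sample string are made up for this example.

```javascript
// Hypothetical helper mirroring step 2 of the script:
// raw file text in, array of [find, change] pairs out.
function parsePairs(raw) {
  var text = raw.replace(/\r/g, "\n");               // carriage returns to newlines
  text = text.replace(/\t +/g, "\t")                 // spaces after a tab
             .replace(/ +\t/g, "\t")                 // spaces before a tab
             .replace(/\t+/g, "\t");                 // runs of tabs
  text = text.replace(/\n\n+/g, "\n");               // empty lines
  var lines = text.split("\n");
  var pairs = [];
  for (var i = 0; i < lines.length; i++) {
    // Trim the line, then split it at the tab.
    var fields = lines[i].replace(/^ +/, "").replace(/ +$/, "").split("\t");
    // Same rule as step 4a: keep only lines with exactly two elements.
    if (fields.length == 2) { pairs.push(fields); }
  }
  return pairs;
}

// Sample with stray spaces, a double tab, Windows line ends and an incomplete line.
var sample = "ausstossen \t ausstoßen\r\n\r\n  Hohlmass\t\tHohlmaß  \r\nsüss\n";
var pairs = parsePairs(sample);
// pairs holds two entries; the incomplete "süss" line is skipped
```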
                      
                      • 8. Re: GREP on a list
                        samar02 Level 1

                        Thank you again for your help!

                         

                        Now I have four things:

                         

                        (1) an ID book file with twenty *.INDD files, most of them containing, among many others, some words spelt with double-s (ss) instead of Eszett (ß);

                        (2) a tab-delimited *.TXT file with an exhaustive list of words that need changing ("OldwordTabNewword") in the files mentioned under (1);

                        (3) your JavaScript;

                        ... and ...

                        (4) no idea as to how these three things can be made to work! And still, I rely on your assertion that I "can simply change each word, one at a time, with the appropriate one. After changing the findchangebylist data file, all it takes is one double-click to do your changes."

                         

                        Okay, since I have already put a lot of work into compiling my file (2) above, and since I soon seem to be crossing the finishing line, I am determined to get this done. So in your script file, I have included the file path (and file name) of my file (2) under 1a, moved your file to the Scripts Panel, and started it from there. And now I get the Script Alert "no file?"

                         

                        Nearly there, I hope ...

                        • 9. Re: GREP on a list
                          [Jongware] Most Valuable Participant

                          The "no file?" alert means that the script was unable to open the file. Are you *sure* you have copied the full path and filename correctly?

                           

                          Without being able to look over your shoulder, let me guess ;-) are you on Windows? If so, does the path you entered contain backslashes \ ? The backslash is an escape character inside a JavaScript string, so if you used single ones, that may be the cause of the "no file" error. If so, you can solve it in two ways: either *double* every backslash, or replace them with the forward slash / -- that may seem odd on Windows, but Adobe has that covered for you.
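To see what single backslashes do to a path string, here is a small sketch in plain JavaScript. The path is a made-up example, not the one from the thread.

```javascript
// Made-up example path. With single backslashes, "\t" is read as a tab
// and "\n" as a newline, so the string no longer contains the real path.
var single  = "C:\temp\new.txt";

// Doubling each backslash keeps the path intact: C:\temp\new.txt
var doubled = "C:\\temp\\new.txt";

// The forward-slash alternative: C:/temp/new.txt
var forward = "C:/temp/new.txt";

var singleHasBackslash  = single.indexOf("\\") !== -1;   // false
var doubledHasBackslash = doubled.indexOf("\\") !== -1;  // true
```

Either `doubled` or `forward` is the kind of string that would go into `new File(...)` at step 1a of the script.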

                           

                          -- which makes me think of another possible Windows pitfall. If on Windows, do you have "Hide common file extensions" switched on? Your file name may be something other than what you think it is.

                           

                          If you are *not* using Windows, or the above doesn't make it jump to life, try something else. It's still quite likely something else in that path is incorrect, but you can actually do without it! Copy your data file into your scripts folder as well, and then all you need in the script is the file *name*.

                           

                          (And if *that* does not work ... rename the data file to something *extremely* simple, say "a", just to prevent typos? Make a screenshot of the file folder, showing its name and post it in here ...?)

                          • 10. Re: GREP on a list
                            samar02 Level 1

                            You were right with the backslashes (I'm on Windows 7). Just escaping them resolved the problem, and after a double-click, everything got changed according to plan. Excellent, what a time saver! Fantastic! And here it is, on my PC, and I can revert to this script again, whenever necessary.

                             

                            Thank you, Jongware!