5 Replies Latest reply on Feb 24, 2014 8:11 AM by Fritschy

    GREP search for all characters


      (InDesign CS6, Windows 7)

      For a book I'm typesetting, the author has marked text that must be indented like this:








      (Of course, in an ideal world, the publisher would have instructed the author to simply use a Word function for this.)


      I've made a paragraph style for indented text. I wanted to replace this in one stroke, with the following GREP search:


      (\[indent\])([.|\s|\r]+)(\[/indent\])     The middle part is meant to match one or more (+) of any character (.), space or paragraph return. Replace it with:


      $2 and the paragraph style          (getting rid of the markers at the same time).


      Nothing was found. I changed it to (\[indent\])([\w|\s|\r]+) (without the end marker [/indent], to see what would happen), and now it found something, but it stopped at the first punctuation mark (not reaching the end marker). So I tried adding punctuation marks: (\[indent\])([\w|\.|\,|:|\s|\r]+) and each time more was found (by the way, \. here means a full stop). So this more or less works.

      The problem is, it's a technical manual and most of the equations etc. are in text that should be indented, so there are a lot of characters that are neither letters nor figures. And of course you can't call this a neat solution.


      This is the kind of search-and-replace string that I would use on many other projects too. Does anyone know a better solution? Since 'any character' doesn't seem to work, is there a code for 'any character that is not a letter or a figure'? Thanks in advance.

        • 1. Re: GREP search for all characters
          Peter Spier Most Valuable Participant (Moderator)

          Try (?s)(\[indent\]\r?)(.+?)(\[/indent\]\r?)


          The (?s) turns on single line mode so it will find across multiple paragraphs, and the (.+?) is the shortest match between the tags. I've included the returns after the tags, if there are any, based on your example that the tags are separate paragraphs. If they are at the end of the paragraph that you want to keep, remove the \r? to prevent losing the paragraph breaks.
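
          For anyone who wants to try the expression outside InDesign, here is a minimal sketch in Python. It is only an approximation: Python's re module is not InDesign's GREP engine, the sample story is made up, and "\r" stands in for the paragraph return. Note that in Python the dot already matches "\r" without (?s); the flag is kept only to mirror the expression above. Applying the paragraph style is not something this sketch can show.

          ```python
          import re

          # Hypothetical sample story; "\r" stands in for InDesign's paragraph return.
          story = (
              "Intro paragraph.\r"
              "[indent]\r"
              "x = y + 2 (eq. 1)\r"
              "Second indented line.\r"
              "[/indent]\r"
              "Closing paragraph.\r"
          )

          # The expression from the reply above, unchanged.
          pattern = re.compile(r"(?s)(\[indent\]\r?)(.+?)(\[/indent\]\r?)")

          # Keep only group 2, the text between the tags.
          # InDesign's Change to field writes this as $2; Python writes it as \2.
          stripped = pattern.sub(r"\2", story)

          # The text that would receive the indent paragraph style.
          indented = [m.group(2) for m in pattern.finditer(story)]

          print(repr(stripped))
          print(indented)
          ```

          Both tag paragraphs disappear and the paragraphs between them are left untouched, punctuation and equation characters included.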

          • 2. Re: GREP search for all characters
            Fritschy Level 1

            That's doing the trick! Thanks very much, Peter.

            I'll still have to study why exactly this works: 'single line mode' is not the first thing that comes to mind when you're addressing multiple lines, but I guess there is a logical explanation for it.

            I guess things went wrong for me because of the missing question mark in '+?'. I don't quite understand how this 'shortest match' works. Is it the shortest string of characters before the next condition ([/indent]) is found? And without 'shortest match', is [/indent] part of what it finds when looking for 'any character', so the search program will never get to the condition ([/indent])?

            • 3. Re: GREP search for all characters
              [Jongware] Most Valuable Participant



              (1) "Single Line Mode" is named so because it basically makes GREP ignore hard returns. Historically, GREP used to only work on 'single lines' -- it had no concept of "hard return", and the 'end of the line' was essentially always the 'end of input'. When it got adjusted to work on multiple *paragraphs*, separated by returns, the returns themselves took the place of 'end of input'. Enabling "Single Line Mode" treats the return as a regular character, and so the entire story is considered one very long line. ("Treats the return as a regular character", in this context, means that a wildcard such as '.' -- "match any character" -- now also matches a return.)
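
              The same behaviour can be seen in a small Python sketch. This is an analogy rather than InDesign itself: Python's re module is a different engine, and its line separator "\n" plays the role that the hard return plays in InDesign. The sample text is invented.

              ```python
              import re

              # Two "paragraphs"; "\n" plays the role of the hard return in this sketch.
              text = "first paragraph\nsecond paragraph"

              # Without single line mode the dot cannot cross the line break,
              # so a match that has to span both lines is not found.
              without_flag = re.search(r"first.+second", text)

              # With (?s) the line break is just another character for the dot.
              with_flag = re.search(r"(?s)first.+second", text)

              print(without_flag)
              print(repr(with_flag.group()))
              ```

              The first search finds nothing; the second one runs straight through the line break.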


              (2) "Shortest match" is to distinguish it from its default, "Longest match". By default, GREP is Greedy. With an input string "a12a34a", the GREP 'a.+a' will match the entire string. Exactly as asked, it starts with the "a", matches one or more 'any' characters (the part "12a34"), then ends with "a" again. No arguing with that logic. However, if you only want it to match from the first to the second "a", you need to tell it to use the shortest (still possible) match: "a.+?a". This will match only "a12a".


              The first, "longest", match still needs to be possible. That is, it will not attempt "12a34a" for the part ".+" because a single "a" must follow. Therefore, the longest possible match is "12a34".
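
              The "a12a34a" example can be checked with a few lines of Python. Python's re module is not InDesign's GREP, but greedy and shortest-match quantifiers behave the same way for this input.

              ```python
              import re

              text = "a12a34a"

              # Greedy (default): .+ takes as much as it can while a final "a" can still follow.
              greedy = re.search(r"a.+a", text).group()

              # Shortest match: .+? stops as soon as the next "a" can follow.
              lazy = re.search(r"a.+?a", text).group()

              # What the greedy .+ itself matched, captured in a group.
              greedy_inner = re.search(r"a(.+)a", text).group(1)

              print(greedy, lazy, greedy_inner)
              ```

              Here greedy is the entire string, lazy is "a12a", and greedy_inner is "12a34", matching the explanation above.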


              For an in-depth guide to GREP in InDesign, Peter Kahrel's "GREP in InDesign" (http://shop.oreilly.com/product/9780596156015.do) is highly recommended. I read in one of its reviews,


              Every InDesign user should understand and use GREP, and this book is the best way to master it.
              • 4. Re: GREP search for all characters
                Peter Spier Most Valuable Participant (Moderator)

                Everything I know about GREP I learned from Jongware and Peter Kahrel's book. I'll second the recommendation.

                • 5. Re: GREP search for all characters
                  Fritschy Level 1

                  Thanks again, I've ordered the book.