2 Replies Latest reply on Mar 13, 2013 6:08 AM by Andreas Jansson

    InDesign GREP and alternatives in a positive lookbehind

    Andreas Jansson Level 2

      When using positive lookbehind (?<=  ... )

      such as this...

      (?<=open|enter).+
      

      ... to get a match on everything following the words "open" or "enter", I found a small problem with the GREP implementation in InDesign (or so I think).

       

      Requiring that either word exists, I find no other solution than to set up two different grep expressions, which is acceptable, but in my eyes not very good:

      Capture2styles.JPG

       

      My example is simplified here.

       

      It seems that the two "alternative" words need to be the same length to get a match. Before sending this as a question, I found that Jongware had this in his GREP help:

      (?<=text) (positive lookbehind) fixed length
      (?<!text) (negative lookbehind) fixed length

      From http://www.jongware.com/idgrephelp.html

       

      So adding a space after "open" (making it "open ") would get me the right result, since both alternatives are then the same length:

      (?<=open |enter).+
      

       

      Capture.JPG

       

      But in my real world scenario that's not possible (since the words are of completely different lengths).
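
      As a rough cross-check outside InDesign, the same two patterns can be tried in Python's `re` module. This is only an analogy: Python's engine is not the one InDesign uses, and it rejects unequal-length alternatives outright instead of mismatching. The helper names `compiles` and `text_after` are made up for this sketch.

```python
import re

UNEQUAL = r"(?<=open|enter).+"   # alternatives of 4 and 5 characters
PADDED = r"(?<=open |enter).+"   # both alternatives are 5 characters


def compiles(pattern: str) -> bool:
    """Hypothetical helper: True if Python's re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def text_after(pattern: str, text: str):
    """Hypothetical helper: return the matched text, or None if no match."""
    m = re.search(pattern, text)
    return m.group() if m else None


if __name__ == "__main__":
    print(compiles(UNEQUAL))                     # rejected by Python's re
    print(compiles(PADDED))                      # accepted: equal widths
    print(repr(text_after(PADDED, "open the door")))
    print(repr(text_after(PADDED, "enter the room")))
```

      Note that the padded version swallows the space after "open" but not the one after "enter", which is one more reason the padding trick does not generalize.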

       

      *** Less important extra information about non Adobe GREP starts here ***

       

      On this page: http://caspar.bgsu.edu/~courses/Stats/Labs/Handouts/grepadvanced.htm, I found a piece of text dealing with this exact issue. According to that text, however, the alternatives may have different lengths, as long as each one has a fixed length (i.e. no wildcards inside the lookbehind expression):

       

      The contents of a lookbehind assertion are restricted such that all the strings it matches must have a fixed length. However, if there are several alternatives, they do not all have to have the same fixed length. Thus

          (?<=Martin|Lewis)

      is permitted [...]

       

      *** Unimportant extra information about non Adobe GREP ends here ***

       

      The author of the caspar.bgsu.edu page above also writes that several assertions can be added one after another:

       

      Several assertions (of any sort) may occur in succession. For example,

          (?<=\d{3})(?<!999)foo

        matches "foo" preceded by three digits that are not "999". Notice that each of the assertions is applied independently at the same point in the subject string.

       

      But I tried that as well, using this GREP with separate assertions for the positive lookbehinds in my example:


      (?<=open)(?<=enter).+

       

      ... resulting in no match at all:

      CaptureNoMatchSepAssert.JPG

       

      I don't know where this is covered in Jongware's nice help file on InDesign's GREP flavour; it might be there, and I just don't know the correct terminology.

       

      Not much of a question, this, but perhaps it could be a subject for discussion anyway, not least if my observations are incorrect.

       

      Thanks,

      Andreas Jansson

        • 1. Re: InDesign GREP and alternatives in a positive lookbehind
          [Jongware] Most Valuable Participant

          Andreas, there are different implementations of GREP, so your second source might be referring to another engine. (According to Peter Kahrel, the GREP engine used in InDesign comes from the code library "boost".)

           

          However, the example you cited

           

          (?<=\d{3})(?<!999)foo

           

          works correctly in InDesign as well, because there is no "different lengths" issue! Both terms consist of three characters, and additionally these are two different lookbehinds -- not a single one, as yours is! And it works because the first assertion (?<=\d{3}) also matches "999". Then the second assertion makes it fail on '999' only. At a glance, there is no easy way to write this as a single lookbehind -- not without nesting even more lookaheads and lookbehinds inside the lookbehind, I think.
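
          To see the "both assertions apply at the same point" behaviour outside InDesign, here is a small sketch using Python's `re` module. That is a different engine from InDesign's, so treat it as an illustration of the logic only; the function name `matches_foo` is invented for this example.

```python
import re

# Two independent lookbehinds, both tested at the position just before "foo":
#   (?<=\d{3})  the three preceding characters must be digits
#   (?<!999)    ... and they must not be "999"
PATTERN = re.compile(r"(?<=\d{3})(?<!999)foo")


def matches_foo(text: str) -> bool:
    """Hypothetical helper: True if "foo" is found with both assertions satisfied."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for sample in ("123foo", "999foo", "12foo"):
        print(sample, matches_foo(sample))
```

          Only "123foo" passes both tests: "999foo" satisfies the first assertion but fails the second, and "12foo" fails the first.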

           

          Your proposal based on that example

           

          (?<=open)(?<=enter).+

           

          doesn't work because, well, logically the text is either "open" OR "enter". Yours checks for both at the same time, so it can never match. The trick indeed is to use the OR operator, which you tried first, but in a slightly different way. The lookbehind problem only occurs when you are trying two different lengths in one single lookbehind:

           

          (?<=open|enter)foo

           

          has one lookbehind operator, with two different length strings. Re-writing it as a logical OR of the two strings

           

          ((?<=open)|(?<=enter))foo

           

          will make it work correctly. In the image you can see the problem: the top GREP (yours) leaves the 'r' of 'enter' in red, because (apparently) the shortest sequence of the two is processed. The bottom shows the correct behavior.

           

          lookbehind.png
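
          For anyone who wants to test the logic outside InDesign, the sketch below does the same thing with Python's `re` module. It is a different engine, so this only illustrates the AND versus OR difference, not InDesign's exact behaviour. It uses a non-capturing group (?: ... ) where the expression above uses plain parentheses, and the name `text_after_keyword` is invented for this example.

```python
import re

# OR: one group holding two separate lookbehinds, each with its own fixed length.
EITHER = re.compile(r"(?:(?<=open)|(?<=enter)).+")

# AND: two lookbehinds in succession. The preceding text would have to end in
# "open" and in "enter" at the same position, which is impossible.
BOTH = re.compile(r"(?<=open)(?<=enter).+")


def text_after_keyword(text: str):
    """Hypothetical helper: text following "open" or "enter", or None."""
    m = EITHER.search(text)
    return m.group() if m else None


if __name__ == "__main__":
    print(repr(text_after_keyword("open the door")))
    print(repr(text_after_keyword("please enter the room")))
    print(BOTH.search("open the door"))
```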

          • 2. Re: InDesign GREP and alternatives in a positive lookbehind
            Andreas Jansson Level 2

            Great explanation!

            And now the paragraph style is working just as it should, with a single line of GREP.

            Thank you!