
Boolean (Conditional) Greps

May 6, 2012 12:11 AM

How can I make a condition for a grep?

 

For example, all capital letters except for the letter "S"

or all spaces except for hair width spaces.

 

Thanks

 
Replies
  • May 6, 2012 3:51 AM   in reply to Trevorׅ

    As far as I know (but there are other users here with more knowledge of GREP) you can't do that. You can negate an entire class, or create a class of things you want to match, but you cannot create an exception to a class.

     
  • May 6, 2012 4:11 AM   in reply to Trevorׅ

    The easiest way of excluding something with GREP is not to search for it in the first place.

     

    [A-RT-Z] will match all regular capitals except for the 'S'.
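If you want to try the two-range trick outside InDesign, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in (InDesign's GREP engine is a different implementation), and the sample string is made up for illustration.

```python
import re

# Two ranges, A-R and T-Z, leave a one-letter gap exactly where 'S' would be.
CAPS_EXCEPT_S = re.compile(r"[A-RT-Z]")

sample = "SASSY Test"  # hypothetical sample text
matches = CAPS_EXCEPT_S.findall(sample)
print(matches)
```

Only the plain capitals other than 'S' are returned; lowercase letters and the space are ignored because they are not in the class at all.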

     

    [~m~>~f~S~s~<~/~.~3~4~%\x{20}] matches all spaces except the Hair Width space (this is the built-in Multiple Space to Single Space query, minus the Hair Width code ~|).

     

    A more advanced way to specifically exclude something is to use a Lookahead or Lookbehind. Usually, you would put a Lookahead at the end of an expression, to match something only if it's followed by the Lookahead item. If you put a Negative Lookahead in front of an expression instead, it tests the very character the expression is about to match:

     

    (?!S)\u

     

    This will match any uppercase character (including Greek and Cyrillic), but the Negative Lookahead ensures it will not match the exact character 'S'. Similarly,

     

    (?!~|)\s

     

    matches any whitespace except the Hair Width space.
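The two lookahead patterns above can be approximated in Python as a rough sketch. The assumptions: Python's `re` has no `\u` or `~|` codes, so `[A-Z]` stands in for `\u` (ASCII capitals only, no Greek or Cyrillic) and the Unicode hair space U+200A stands in for `~|`. The sample strings are hypothetical.

```python
import re

# [A-Z] is a stand-in for InDesign's \u, which Python's re does not support.
# The negative lookahead rejects the position if the next character is 'S'.
upper_not_s = re.compile(r"(?!S)[A-Z]")

# U+200A is the Unicode hair space, standing in for InDesign's ~| code.
space_not_hair = re.compile(r"(?!\u200A)\s")

letters = upper_not_s.findall("SOS Signal")
# The test string holds a regular space, a hair space, and a thin space (U+2009).
spaces = space_not_hair.findall("a b\u200Ac\u2009d")

print(letters)
print([hex(ord(ch)) for ch in spaces])
```

In both cases the lookahead consumes nothing; it only vetoes the one character you want left out, and the class after it does the actual matching.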

     

    If in addition you also want to exclude any accented form of the letter "S", you would use

     

    (?![[=s=]])\u

     

    -- the code [[=s=]] stands for 'any equivalent of "s"', which includes s and S, and also Ș and ś, and other accented forms (but not Cyrillic 'с' or Greek 'Σ', since those are not forms of the Latin S).
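Python's `re` has no equivalence-class syntax like `[[=s=]]`, so the following is only a loose approximation of the idea, not how InDesign does it. It assumes that "an equivalent of s" can be detected by Unicode NFD decomposition (the first code point of the decomposed character is a Latin s); the helper names are made up for this sketch.

```python
import unicodedata

def is_s_variant(ch):
    """True if ch is 's'/'S' or an accented form that decomposes to one."""
    base = unicodedata.normalize("NFD", ch)[0]  # base letter without accents
    return base.lower() == "s"

def uppercase_except_s_variants(text):
    """Collect uppercase characters, skipping 'S' and its accented forms."""
    return [ch for ch in text if ch.isupper() and not is_s_variant(ch)]

# S, S with acute, S with comma below, Cyrillic Es, Greek Sigma, A
sample = "S\u015A\u0218\u0421\u03A3A"
result = uppercase_except_s_variants(sample)
print(result)
```

The Cyrillic and Greek letters survive because they do not decompose to a Latin S, which mirrors the behaviour described above. Characters with no canonical decomposition are not treated as S variants here; whether InDesign's class covers them is not something this sketch can tell you.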

     

    These examples only work one character at a time. You can repeat them, but you must pay attention to what you repeat:

     

    (?!S)\u+

     

    will match one uppercase character that is not "S", followed by an arbitrarily long string of uppercase characters, which may include "S". The Lookahead is only tested at the first character. To make it do what you (probably) intend, instead use

     

    ((?!S)\u)+
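The difference between the two repetitions is easy to see side by side. This is a small sketch in Python's `re`, again assuming `[A-Z]` as a stand-in for `\u`; the test string is made up.

```python
import re

text = "ABSCD"  # hypothetical sample

# Lookahead checked once, before the first character only.
first_only = re.compile(r"(?!S)[A-Z]+")

# Lookahead checked again before every repeated character.
every_char = re.compile(r"((?!S)[A-Z])+")

# finditer + group() is used so we get whole matches, not just the last group.
loose = [m.group() for m in first_only.finditer(text)]
strict = [m.group() for m in every_char.finditer(text)]

print(loose)
print(strict)
```

The first pattern swallows the 'S' in the middle of the run, while the second stops in front of it and picks up again after it.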

     