8 Replies Latest reply on Jan 11, 2013 1:49 AM by [Jongware]

    GREP Question:  Style em spaces that precede only certain triggers within a paragraph

    vazrick2 Level 1

      I've been asked to re-post this InDesign GREP question here under scripting...

       

      Hello,

       

      Up front:

      I do not know GREP well enough to begin to tackle this.  And as usual, I'm in a pinch!

      Also, if you're the visual type, just look at the second to the last paragraph at the bottom, and that may be enough to help you devise the GREP style.  Otherwise, bear with me, I have to think this through...

       

      I have a list that spans pages of events.  The events are formatted as follows:

      Box City EventNameSpecific Month EndPara (this is a single paragraph in a story of a hundred such paragraphs)

       

      On paper, each event needs to be preceded by a filled box in a specific color.  I would like this box to be made from a strikethrough applied to an em space using GREP, since that is very easy to create using a character style and only requires one setting (the thickness of the strikethrough).

       

      There is a legend at the bottom of the page, detailing about 14 types of events.  Some of the different types of events will require the same colored box.  But in all I have 7 different colors of boxes in the legend.

       

      I'm looking to create...

       

      • 7 character styles, one for each color (consider this done already)
      • 1 paragraph style (essentially done, but missing the GREP)
      • 7 GREP styles (or more as necessary) that apply only to the em spaces that precede specific sets of text strings
        • If I need to, or if it is simpler to, create 14 GREP styles within the paragraph style, that's fine.  But I understand we may be able to do something like this... (Target phrase 1 | Target phrase 2 | Target phrase 3) between parentheses to identify multiple possible triggers within the same rule.
        • I'm fine whether it's 7 or 14 rules in there, I just want to be able to update them over the course of time as necessary - but it won't be for another season at least if I do.

      I don't know GREP well enough to piece together what I need.  Vaguely familiar with look aheads and look behinds.  I do understand it may be easier if the em space is in between other characters, so in my example below I've inserted hairspaces.  But ultimately I want the em space to be flush or nearly flush to the left of the frame. For my example below:

      • I only want to apply character styles to em spaces based on the strings in bold.
      • The strings of text that will trigger the GREP will vary in number of words and may or may not contain a dash.
      • I want to ignore Citynames regardless of number of words... Chicago vs. San Francisco vs. Vancouver, BC for example.
      • I want to ignore MonthofYear which appears after the second tab.
      • The tabs will have set stops as a regular part of the paragraph style.  I'm including them in my example below in case it helps to visualize possible anchors to use in GREP.
      • I want the style to be easy enough to modify, for example if I want to simply add an (EEEE New Eventname) to the list of possible options within a rule that triggers a specific character style.  See below should make it clearer.

      So here's what I'm thinking.  Within the paragraph style, the first two GREP styles should respond to the following criteria:

      Rule 1:  If the paragraph contains any of the following, then apply Char Style 1 to the em space in that paragraph.  Either use a separate GREP expression for each phrase below, each applying Char Style 1, or use a single GREP expression that captures all three possible triggers:

      • ABC BB Invitational
      • ABC Xxxxxxxxxxxxxx
      • ABC MM-Xxxxxxxxxxxxxxx

      Rule 2: If the paragraph contains any of the following, then apply Char Style 2 … and so on

      • Global ABC Xxxxxxxxxxxxxxx
      • Global WXYZ Xxxxxxxxxxxxxxx

       

      I cannot count on “ABC” or “Global” being the trigger for the style, if you know what I mean.  I need the entire phrase (ABC BB Invitational) to be the trigger… if it exists in its entirety, then apply the style to the preceding em space in that paragraph.  This way, if there are any misspellings or if we launch a new event type which ends up flowing into my document, I will know it.

       

      hairspace  emspace  tabspace CitynameOneWord ABC BB Invitational tabspace MonthofYear

      hairspace  emspace  tabspace  Cityname TwoWrds ABC BB Invitational tabspace MonthofYear

      hairspace  emspace  tabspace  Cityname MultiWrds ABC BB Invitational tabspace MonthofYear

      hairspace  emspace  tabspace  Cityname TwoWrds ABC MM-Xxxxxxxxxxxxxx tabspace MonthofYear

      hairspace  emspace  tabspace  Cityname MultiWrds ABC Xxxxxxxxxxxxxx tabspace MonthofYear

      hairspace  emspace  tabspace  Cityname TwoWrds Global ABC Xxxxxxxxxxxxxxx tabspace MonthofYear

      hairspace  emspace  tabspace  CitynameOneWord Global WXYZ Xxxxxxxxxxxxxxx tabspace MonthofYear

      hairspace  emspace  tabspace  CitynameOneWord Global Special Xxxxxxxxxxxxxxx tabspace MonthofYear

      hairspace emspace tabspace  CitynameOneWord Globl Special WRONG SPELLING  tabspace MonthofYear


      I hope this makes sense and isn't too redundant.  Time for bed.  Fingers crossed someone will post at least one GREP string, so I have some magic code for tomorrow morning!  Ideally, it would be great if you would include a brief explanation of what the string is doing, but minimally, please do use one of my text strings above, so I know what to mess with and what not to.  ;-)

       

      <says prayer>

       

      Thanks!

      Rick

        • 1. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
          Jump_Over Level 5

          Hi,

           

          I think hairspace is useless (in case of GREP at least).

          If I understood this correctly, the key difference is:

          tab_any_ABC_any_tab

          tab_any_Global ABC_any_tab

          tab_any_Global Special ABC_any_tab

          etc

           

          My suggestion:

          Your paragraph style definition would contain a list of GREP styles:

          1.  style: 1CharStyle

               condition: ~m(?=\t.+ABC.+\t)

          2.  style: 2CharStyle

               condition: ~m(?=\t.+Global ABC.+\t)

          3.  style: 3CharStyle

               condition: ~m(?=\t.+Global Special ABC.+\t)

          ....

          and so on, once for each unique condition you have.

          Paragraphs with a wrong spelling should stay untouched.

           

          hopefully...
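For anyone who wants to try rules like these outside InDesign, here is a minimal sketch using Python's `re` module. It assumes that InDesign's `~m` corresponds to the em space character U+2003 and that Python's engine treats these simple patterns the same way InDesign does, which has not been verified here. The rule table, the trigger phrases and the sample paragraphs are hypothetical stand-ins, and the sketch matches whole phrases with alternation rather than just "ABC".

```python
import re

EM_SPACE = "\u2003"  # assumed equivalent of InDesign's ~m metacharacter

# Hypothetical rules: character style name -> full trigger phrases (literal text).
RULES = {
    "1CharStyle": ["ABC BB Invitational", "ABC MM-Regional"],
    "2CharStyle": ["Global ABC Open", "Global WXYZ Open"],
}

def build_pattern(phrases):
    """Match an em space only if a tab, optional city text,
    one of the trigger phrases and another tab follow it."""
    alternatives = "|".join(re.escape(p) for p in phrases)
    return re.compile(EM_SPACE + r"(?=\t.*(?:" + alternatives + r")\t)")

def styles_for(paragraph):
    """Return the names of all rules whose pattern matches the paragraph."""
    return [name for name, phrases in RULES.items()
            if build_pattern(phrases).search(paragraph)]

samples = [
    EM_SPACE + "\tChicago ABC BB Invitational\tMay",
    EM_SPACE + "\tSan Francisco Global ABC Open\tJune",
    EM_SPACE + "\tBoston Globl ABC Opne\tJuly",  # misspelled: matches no rule
]
for sample in samples:
    print(styles_for(sample))
```

Because the match itself is only the em space, a character style applied to the match would touch nothing else in the paragraph; the lookahead merely checks what follows.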

          • 2. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
            vazrick2 Level 1

            Thanks Jump_Over!

             

            Your suggestion worked.  I have 17 distinct Grep Styles in 1 single Paragraph Style to effectively color em dashes (as colored bullets referred to in a legend) using 8 different character styles (for colors) across 128 events/paragraphs.  You have no idea how good it is going to feel NEXT YEAR when we import another 128+ events from Excel and do a find/replace to insert em dashes and tabs, followed by the application of a single paragraph style!  Actually, you probably do.

             

            Prior to that, whichever designer needed to update the file was manually placing individual anchored object boxes next to each event, one at a time.

             

            Slight modifications to the above suggested rules to make this work for me...

             

            1.  style: 1CharStyle

                 condition: ~m(?=\t.+ABC exact text same+\t)

            2.  style: 2CharStyle

                 condition: ~m(?=\tGlobal ABC exact text same+\t)

             

            List of things different from above posted examples:

            • I removed the period . from all of the Grep styles that appeared just before the +\t)
            • Where ABC exact text same appears with and without Global preceding, I removed the .+ before Global to make it work properly.
            • Also in such instances, I made sure to order those with Global to appear after those without.

             

            I still don't understand exactly how it all works.  It was intuition that had me guess at why it wouldn't work if they weren't ordered properly.  And for example, in some instances we had this: " ABC exact text same (launch) " and the Grep Style as-is would not work.  I played with it before giving up, and was fortunate to learn from management they actually wanted " (launch) " removed, so no problem, this time.  But I did learn some valuable things.
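A possible explanation for the " (launch) " case, sketched in Python's `re` module: with the period removed, the `+` applies to the last letter of the literal text ("one or more e"), so the phrase has to be followed directly by the tab. The sketch assumes `~m` is the em space U+2003 and that Python's engine behaves like InDesign's for these patterns; the sample paragraphs are hypothetical.

```python
import re

EM = "\u2003"  # assumed equivalent of InDesign's ~m

# The rule as posted: "same+" means "sam" followed by one or more "e".
strict = re.compile(EM + r"(?=\t.+ABC exact text same+\t)")

# A hypothetical variant that also tolerates an optional " (launch)" suffix.
tolerant = re.compile(EM + r"(?=\t.+ABC exact text same(?: \(launch\))?\t)")

plain = EM + "\tChicago ABC exact text same\tMay"
extra_e = EM + "\tChicago ABC exact text sameee\tMay"
launch = EM + "\tChicago ABC exact text same (launch)\tMay"

for name, text in [("plain", plain), ("extra_e", extra_e), ("launch", launch)]:
    print(name, bool(strict.search(text)), bool(tolerant.search(text)))
```

In this sketch the strict rule accepts `plain` and `extra_e` but rejects `launch`, because a space rather than a tab follows "same". Writing `same\t` without the `+` would avoid accepting stray repeated letters.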

             

            Now on with production.

             

            Thanks again, Jump_Over, until next time!

             

            -Rick

            • 3. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
              Peter Spier Most Valuable Participant (Moderator)

              That's interesting. I was under the impression that .+ was not valid in a lookahead/lookbehind.

              • 4. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
                vazrick2 Level 1

                Hmmm... I have no historical experience.  It seems to work at the beginning, but not at the end.

                • 5. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
                  Peter Spier Most Valuable Participant (Moderator)

                  I would not expect that expression to work unless your trigger text immediately follows the tab. Did you get it to work with intervening cities? That's why I answered the other thread the way I did and suggested coming here. I think it's scriptable, just not GREP stylable.

                   

                  But I've been wrong before.

                  • 6. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
                    Harbs. Level 6

                    Variable length lookarounds generally do not work.

                     

                    Here's a quote from http://www.regular-expressions.info/lookaround.html:

                     

                    Therefore, many regex flavors, including those used by Perl and Python, only allow fixed-length strings. You can use any regex of which the length of the match can be predetermined. This means you can use literal text and character classes. You cannot use repetition or optional items. You can use alternation, but only if all options in the alternation have the same length.

                    PCRE is not fully Perl-compatible when it comes to lookbehind. While Perl requires alternatives inside lookbehind to have the same length, PCRE allows alternatives of variable length. Each alternative still has to be fixed-length.

                    Java takes things a step further by allowing finite repetition. You still cannot use the star or plus, but you can use the question mark and the curly braces with the max parameter specified. Java recognizes the fact that finite repetition can be rewritten as an alternation of strings with different, but fixed lengths. Unfortunately, the JDK 1.4 and 1.5 have some bugs when you use alternation inside lookbehind. These were fixed in JDK 1.6.

                    The only regex engines that allow you to use a full regular expression inside lookbehind, including infinite repetition, are the JGsoft engine and the .NET framework RegEx classes.

                    Finally, flavors like JavaScript, Ruby and Tcl do not support lookbehind at all, even though they do support lookahead.
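The passage quoted above appears to concern lookbehind specifically, whereas the rules in this thread use lookahead. A minimal Python illustration of the difference follows; Python's `re` is one of the fixed-length lookbehind engines the quote names. Whether InDesign's engine draws the line in exactly the same place is an assumption, as is the use of U+2003 for `~m`.

```python
import re

text = "\u2003\tChicago ABC BB Invitational\tMay"

# Variable-length lookahead: Python compiles and matches this.
ahead = re.compile("\u2003(?=\\t.+ABC BB Invitational\\t)")
print(bool(ahead.search(text)))

# Variable-length lookbehind: Python's re rejects this when compiling.
try:
    re.compile(r"(?<=\t.+)ABC")
    lookbehind_ok = True
except re.error as exc:
    lookbehind_ok = False
    print("lookbehind rejected:", exc)
```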

                    • 7. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
                      vazrick2 Level 1

                      Well, this is what worked...

                       

                      1.  GREP style: 1CharStyle

                           condition: ~m(?=\t.+ABC exact text same+\t)

                      2.  GREP style: 2CharStyle

                           condition: ~m(?=\tGlobal ABC exact text same+\t)

                       

                      It worked in instances where...

                       

                      Rule 1 (color results in let's say a blue emdash)

                      emdash TAB San Francisco(variable city) ABC exact text same TAB endpara

                      emdash TAB Chicago(other variable city) ABC exact text same TAB endpara

                       

                      Rule 2 (color results in gold emdash)

                      emdash TAB Global ABC exact text same TAB endpara

                       

                      In Rule 2 "Global" is part of the text string.

                      Nothing directly precedes the text string except the TAB.

                       

                      So the t.+ or .+ as it were seems to have the effect of allowing variables between the TAB and the triggering text string.

                       

                      Interestingly Rule 2 does not work if I include the .+ because the rule assumes there will be a variable preceding "Global" and not finding one, fails to properly color the emdash.

                       

                      Referencing Harbs's comment above, the variable length lookaround is working (in this case) at least for now.  And I have it working multiple times applied the same way as above for different one-off text strings (with/without Global) – all within the same paragraph style.

                       

                      So far so good!

                       

                      -Rick

                      • 8. Re: GREP Question:  Style em spaces that precede only certain triggers within a paragraph
                        [Jongware] Most Valuable Participant

                        vazrick2 wrote:

                         

                        .. So the t.+ or .+ as it were seems to have the effect of allowing variables between the TAB and the triggering text string.

                        Interestingly Rule 2 does not work if I include the .+ because the rule assumes there will be a variable preceding "Global" and not finding one, fails to properly color the emdash.

                         

                        The 't' doesn't do anything special here (it's the Tab code, '\t'), but it's the '+' that got you.

                         

                        The period . is the standard wildcard -- any single character. You follow it with a plus, which means one or more. So the GREP now only matches if you have a tab, followed by one or more characters, followed by "Global (etc)".

                         

                        If there may or may not be any text between the tab and the fixed text string, use the modifier '*' instead in both cases -- zero or more of the preceding code (the 'any character' wildcard):

                         

                        1. GREP style: 1CharStyle

                        condition: ~m(?=\t.*ABC exact text same+\t)

                        2. GREP style: 2CharStyle

                        condition: ~m(?=\t.*Global ABC exact text same+\t)
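The difference between `.+` and `.*` described above can be checked with a short Python sketch. It assumes `~m` is the em space U+2003 and that Python's `re` treats these patterns the way InDesign does; the sample paragraphs are hypothetical.

```python
import re

EM = "\u2003"  # assumed equivalent of InDesign's ~m

rule2_plus = re.compile(EM + r"(?=\t.+Global ABC exact text same+\t)")
rule2_star = re.compile(EM + r"(?=\t.*Global ABC exact text same+\t)")
rule1_star = re.compile(EM + r"(?=\t.*ABC exact text same+\t)")

no_city = EM + "\tGlobal ABC exact text same\tMay"
with_city = EM + "\tChicago Global ABC exact text same\tMay"

# ".+" needs at least one character between the tab and "Global".
print(bool(rule2_plus.search(no_city)))    # no text before "Global"
print(bool(rule2_plus.search(with_city)))  # "Chicago " fills the ".+"

# ".*" also accepts zero characters, so both paragraphs match.
print(bool(rule2_star.search(no_city)))
print(bool(rule2_star.search(with_city)))

# Rule 1 with ".*" matches "Global ..." paragraphs as well.
print(bool(rule1_star.search(no_city)))
```

Since rule 1 written with `.*` also matches the "Global" paragraphs in this sketch, the ordering Rick described earlier in the thread (rules with "Global" listed after those without) would still matter.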