3 Replies Latest reply on Jan 4, 2013 9:35 AM by David W. Goodrich Branched from an earlier discussion.

    Find/Change multiple characters at once

    Jodi Clayton

      Hello all, I'm not sure if this is the right place to enquire, but I was looking for a quick bit of GREP help.

       

      I'm working on jobs in Mandarin and Korean, and I use find/change to put all Latin characters (letters, bullets and trademark signs) back into a Latin font. Currently I do 5 find/changes per type style (15 total) and was wondering if I could combine all 5 into one.

       

      E.g., I do one find/change for each of ^9, ^8, ^$, ^r, ^d, and was wondering if I could make one GREP search for all 5.

        • 1. Re:  Find/Change
          David W. Goodrich Level 3

          The simple answer is the GREP string "[^9|^8|^$|^r|^d]" (sans quotes) will step through all the characters one by one in a single pass, and maybe that's enough for your purposes.  But have you considered using the language attribute to search for classes of characters?  I like to put strings of C, J, and K characters in their own character style, each with the appropriate language attribute in its definition; I apply these via the GREP search "[\x{2E80}-\x{9FBB}]+".  This GREP doesn't distinguish languages, so I must often make that choice by hand; and it doesn't get every last CJK character, but those it misses tend to be chars I want to know about anyway.  In the files I receive, the language attributes can be wildly unreliable, screwing up hyphenation for the alphabetic text.  If your text is largely CK with just a bit of alphabetic text, then perhaps you could assign a character style to the latter to speed processing en masse.

           

          David
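
For anyone who wants to try the range search outside InDesign, here is a minimal sketch in Python, assuming the standard `re` module. It is not InDesign's GREP engine: the `\x{...}` escapes become `\u....`, and the names `CJK_RUN`, `cjk_runs` and `sample` are made up for illustration.

```python
import re

# Same code-point range as the InDesign GREP "[\x{2E80}-\x{9FBB}]+",
# rewritten with Python's \uXXXX escapes. CJK_RUN is a hypothetical name.
CJK_RUN = re.compile(r"[\u2E80-\u9FBB]+")

def cjk_runs(text):
    """Return each unbroken run of characters that falls inside the range."""
    return CJK_RUN.findall(text)

# Illustrative sample: Han, katakana, Hangul and Latin text mixed together.
sample = "InDesign 排版 テスト 한글 ok"
print(cjk_runs(sample))
```

With this sample the Han and katakana runs are found, while the Hangul syllables (U+AC00 to U+D7A3) lie outside the range and are skipped, which is one case of the search not getting "every last CJK" character.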

          • 2. Re: Find/Change multiple characters at once
            Peter Spier Most Valuable Participant (Moderator)

            I've branched this discussion (and edited the title) to give it a home of its own...

             

            I'm thinking that those were metacharacters for various symbols or wildcards (^8, ^9, ^d and ^r being bullet, any digit, trademark and registered trademark, respectively, in a plain text search query, though I don't recognize the ^$), in which case I think you might want to structure the query a bit differently than David is suggesting.

             

            I would perhaps try ~8|\d|~d|~r|[$] (based on the probably mistaken assumption that ^$ is a dollar sign, in which case it can be entered in the class as a literal).

             

            If we knew if these were in fact metacharacters or were meant to be literals it would help....

            • 3. Re: Find/Change multiple characters at once
              David W. Goodrich Level 3

              Sorry folks, my initial string confused the metacharacters for standard text searches with those for GREP.  For text, "^$" is the wild-card for "any letter," corresponding to "\l" and "\u" in GREP: ~8|\d|~d|~r|\l|\u.  (I should have checked Mike Witherell's very useful chart.)  The brackets are unnecessary, unless you want to add a "+" afterwards, which my second string uses to step through strings of chars.
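
As a rough illustration of "one search for all five", here is a sketch in Python's `re` rather than InDesign GREP. The mapping is an assumption, not something from the chart mentioned above: the bullet is taken as U+2022, the trademark sign as U+2122, the registered sign as U+00AE, and plain ASCII letters and digits stand in for \l, \u and \d. The names `LATIN_RUN` and `latin_runs` are hypothetical.

```python
import re

# Assumed stand-ins for the InDesign metacharacters:
#   ~8 -> U+2022 (bullet), ~d -> U+2122 (trademark), ~r -> U+00AE (registered)
#   \l, \u, \d -> ASCII letters and digits only (a simplification)
LATIN_RUN = re.compile(r"[A-Za-z0-9\u2022\u2122\u00AE]+")

def latin_runs(text):
    """Return (start, end, run) for each run that would go back to the Latin font."""
    return [(m.start(), m.end(), m.group()) for m in LATIN_RUN.finditer(text)]

# Illustrative sample: Chinese text with a Latin brand name and a year.
sample = "产品 Acme\u2122 2013 版"
for start, end, run in latin_runs(sample):
    print(start, end, run)
```

Putting everything in one bracketed class with a trailing "+" means each run is found once, instead of one pass per character type.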

               

              All of these leave word-spaces untouched, which is important in Korean, though the spaces often need removing from C and J text.  I don't handle much K, but I might look ahead and behind word-spaces for a character in the Unicode CJK range (or just Hangul?) in order to mark these.  Then there's punctuation, but that can be a real can of worms.

               

              David
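
The look-ahead/look-behind idea for word-spaces can be sketched like this, again in Python's `re` as a stand-in for InDesign GREP. It assumes the same 2E80 to 9FBB range as the earlier search, and it removes the spaces rather than marking them; the names `CJ`, `SPACE_BETWEEN_CJ` and `strip_cj_spaces` are made up for illustration.

```python
import re

# One character in the range used by the earlier search (assumed range).
CJ = r"[\u2E80-\u9FBB]"

# One or more plain spaces with an in-range character on both sides.
# The lookbehind and lookahead test the neighbours without consuming them.
SPACE_BETWEEN_CJ = re.compile(rf"(?<={CJ}) +(?={CJ})")

def strip_cj_spaces(text):
    """Delete spaces that sit between two characters in the range."""
    return SPACE_BETWEEN_CJ.sub("", text)

print(strip_cj_spaces("中文 排版 and Latin text"))
print(strip_cj_spaces("한국어 조판"))
```

Because Hangul syllables fall outside this range, Korean word-spaces are left alone, and so is a space between a CJ character and a Latin word.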

               

              Message was edited by: David W. Goodrich (typo)