You'd have to do it on a case by case basis.
It would have to be
This isn't the ideal way to do it.
Best bet is to go to Preferences > Spelling and turn on only the "Repeated Words" option.
Then run a spell check and replace them that way.
You can also turn on Dynamic Spelling to highlight duplicated words.
I'm sure it could be scripted to find duplicated words - perhaps try the scripting forum.
The syntax of a Find string is different from the Change string. Use this syntax instead:
-- but it will find one or more consecutive same characters separated by a space. E.g., it will find your "find find" but also the "t t" in "at the". If you want to find entire words only, a good first attempt would be:
but it will fail for some fairly common phrases. It'll find "for foreign lands", "in international relations" and "be better". To exclusively find duplicate words, use this:
or this; slightly better because it will find triplicates as well:
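For anyone who wants to try the progression above outside InDesign, here is a rough sketch in Python's `re` module. The four patterns are my own assumptions about what each step looks like, not the exact GREP strings from this post, and the helper `matches` is just for illustration.

```python
import re

# Assumed patterns for Python's re module (not the thread's original GREP strings).
NAIVE = re.compile(r"(\w+) \1")              # no word boundaries at all
LEFT_ANCHORED = re.compile(r"\b(\w+) \1")    # boundary only before the first word
WHOLE_WORDS = re.compile(r"\b(\w+) \1\b")    # boundary on both sides
REPEATED = re.compile(r"\b(\w+)(?: \1\b)+")  # doubles, triples, and longer runs


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


print(matches(NAIVE, "at the"))                  # ['t t']
print(matches(LEFT_ANCHORED, "be better"))       # ['be be']
print(matches(WHOLE_WORDS, "be better, but find find this"))  # ['find find']
print(matches(REPEATED, "the the the cat sat sat"))  # ['the the the', 'sat sat']
```

The trailing `\b` is what stops "be better" from matching: after the second "be" comes a "t", so there is no word break there.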
Edit: Hi Eugene.
Hm. Due to the definitions of both "\w" ("what is a Word Character") and "\b" ("what is a Word Break"), my proposed GREP will also find this:
it's on one one-sided page
Whether or not this is a valid 'duplicate word' depends on how you define "word". For instance, if "one-sided" is a single word, you can use this to have it not match the above:
You are effectively adding the character '-' to the "Word Character" set on the right. But, in that case you also should add it to the left! Otherwise, it will still (falsely) match
it’s an add-on on your system
Changing the Word Character set on the left as well will make it ignore this occurrence. But note you cannot use \b anymore! It would still pick up the '-' character as a valid 'word break', and the expression [-\w]+ that follows it would never see the hyphen. That leads us to this:
... I think I'm going to leave adding the slash as well to you ...
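The hyphen reasoning can be checked with a small Python sketch. These three patterns are assumptions on my part (Python `re` syntax, with lookarounds standing in for `\b` once the hyphen counts as a word character); they are not the exact strings from the post.

```python
import re

# Assumed Python equivalents; lookarounds replace \b once '-' is a word character.
PLAIN = re.compile(r"\b(\w+) \1\b")
RIGHT_ONLY = re.compile(r"\b(\w+) \1(?![-\w])")
BOTH_SIDES = re.compile(r"(?<![-\w])([-\w]+) \1(?![-\w])")

samples = [
    "it's on one one-sided page",
    "it's an add-on on your system",
    "a well-known well-known fact",
]

for text in samples:
    # Record which of the three patterns reports a duplicate in this sample.
    hits = {
        "plain": bool(PLAIN.search(text)),
        "right_only": bool(RIGHT_ONLY.search(text)),
        "both_sides": bool(BOTH_SIDES.search(text)),
    }
    print(text, hits)
```

Only the last variant ignores both "one one-sided" and "add-on on", and as a side effect it is also the only one that catches a repeated hyphenated word such as "well-known well-known".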
I did not know you could do that!
Thank you, Jongware. This is exactly what I was looking (and hoping) for. For my purposes, the string
is just perfect. I can modify it to even find identical words separated by punctuation:
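A possible shape for such a modification, sketched in Python's `re` module. The pattern and the list of punctuation characters are assumptions for illustration only; adjust the character class to whatever separators matter in your text.

```python
import re

# Assumed pattern: the same word twice, with any mix of spaces and
# common punctuation marks in between. The comparison is case-sensitive.
DUP_ACROSS_PUNCT = re.compile(r"\b(\w+)[ ,.;:!?]+\1\b")

text = "He said that, that was fine. Fine, but no. no more more."
found = [m.group(0) for m in DUP_ACROSS_PUNCT.finditer(text)]
print(found)  # ['that, that', 'no. no', 'more more']
```

Note that "fine. Fine" is skipped because the capitalization differs; a case-insensitive flag would pick it up.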
Eugene Tyson wrote:
I did not know you could do that!
(g) You can have a bit of fun with it. This will find 5-letter palindromes in your text:
(e.g., "level", "civic", "refer"). Unfortunately there is no any-length GREP to find them
This will find three consecutive same characters:
-- heh heh, I just found a "Classsroom" in the book I'm working on!
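Both tricks translate directly to Python's `re` module. The patterns below are my assumed equivalents, not the original GREP strings, and the sample sentence is made up.

```python
import re

# Assumed Python equivalents of the two searches described above.
# Group 1 and group 2 are replayed in reverse around one free middle character.
FIVE_LETTER_PALINDROME = re.compile(r"\b(\w)(\w)\w\2\1\b")
# One character followed by two more copies of itself.
TRIPLE_CHARACTER = re.compile(r"(\w)\1\1")

text = "A civic Classsroom on the level; please refer to it."
palindromes = [m.group(0) for m in FIVE_LETTER_PALINDROME.finditer(text)]
triples = [m.group(0) for m in TRIPLE_CHARACTER.finditer(text)]
print(palindromes)  # ['civic', 'level', 'refer']
print(triples)      # ['sss']
```

Since `\w` also covers digits and the underscore, a string like "11311" would count as a palindrome here too.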
Unfortunately there is no any-length GREP to find them
and there isn't (as far as I know), but this comes close:
It will find any palindrome word from 2 to 13 characters, and by extending the group count to 9 it would be able to find words of as many as 19 characters!
Testing on the 109,000 entries in my English words list ... "deified" -- 7 letters. "malayalam", 9 letters! (That's a language, by the way.) "reviver" and "rotator" are also nice ones.
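The optional-group idea can be sketched in Python, which also supports conditional backreferences of the form `(?(n)...)`. The builder function `palindrome_pattern` and its `half` parameter are hypothetical names of mine, and the generated pattern is an assumption about the approach rather than the exact GREP string: all but the last left-hand group are optional, one optional middle character follows, and each group that took part is replayed in reverse.

```python
import re


def palindrome_pattern(half: int) -> re.Pattern:
    """Build a whole-word palindrome regex for 2 to 2*half + 1 characters.

    The first half - 1 capture groups are optional, group number `half` is
    mandatory, and one optional middle character follows. The right side
    then replays every group that actually matched, in reverse order.
    """
    left = r"(\w)?" * (half - 1) + r"(\w)"
    right = "\\{}".format(half) + "".join(
        "(?({0})\\{0})".format(i) for i in range(half - 1, 0, -1)
    )
    return re.compile(r"\b" + left + r"\w?" + right + r"\b")


UP_TO_13 = palindrome_pattern(6)   # 6 + 1 + 6 characters at most
UP_TO_19 = palindrome_pattern(9)   # 9 + 1 + 9 characters at most

words = ["deified", "malayalam", "reviver", "rotator", "palindrome"]
print([w for w in words if UP_TO_13.fullmatch(w)])
# ['deified', 'malayalam', 'reviver', 'rotator']
```

Raising `half` from 6 to 9 is the "extending the counting up to 9" step: the longest word it can match grows from 13 to 19 characters.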
<g> Well as it is Friday afternoon:
This monstrosity will flush out palindromes regardless of punctuation:
(?i)(?=[[:alpha:]])(\w)?[.,:;'? ]*?(\w)?[.,:;'? ]*(\w)?[.,:;'? ]*(\w)?[.,:;'? ]*(\w)?[.,:;'? ]*(\w)?[.,:;'? ]*(\w)?[.,:;'? ]*(\w)?[.,:;'? ]*(\w)[.,:;'? ]*\w?[.,:;'? ]*(?(9)\9[.,:;'? ]*)(?(8)\8[.,:;'? ]*)(?(7)\7[.,:;'? ]*)(?(6)\6[.,:;'? ]*)(?(5)\5[.,:;'? ]*)(?(4)\4[.,:;'? ]*)(?(3)\3[.,:;'? ]*)(?(2)\2[.,:;'? ]*)(?(1)\1)
The limit of 19 characters -- 9 at the left, one in the middle, same 9 at the right -- can be seen in this snippet from the Wikipedia article on palindromes:
Jongware, you truly need a vacation somewhere far away from computers.
The fact that one can use \1 this way is so inspiring! Now I have a follow-up question.
I am working on dictionary files, and a typical entry is "x ... y ... z ... z ... x ...; z ... §". Each of (x|y|z) stands for a particular word, separated by any text (...), and § is a new paragraph.
Now I need to find a string of two consecutive occurrences of the same word in the same paragraph that are not separated by a semicolon (and not separated by any other "word"). In the example above, I should only find "z ... z". How do I do this?
but this does not work ...
Any ideas? Or is the \1 trick not working in this case?
It does work, in the sense it finds the first 'x' and then anything in between until the next 'x'. Since your '...' can be any text -- just not a semicolon and not the section sign --, it eats up everything up to the next 'x'. That matches precisely what you describe:
... a string of two consecutive occurrences of the same word in the same paragraph that are not separated by a semicolon (and not separated by any other "word") ..
Adding a '?' to make it match the shortest possible match does not add anything useful, since the entire string from the first 'x' to the next one already is "as short as possible".
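To make that behaviour concrete, here is a small Python sketch. The pattern is my guess at the general shape of such a search (a word, any text without a semicolon or paragraph break, then the same word), and the sample entry is invented; a newline stands in for the paragraph end.

```python
import re

# Assumed pattern: a word, then the shortest run of text containing no ';'
# and no paragraph break, then the same word again.
SAME_WORD_AGAIN = re.compile(r"\b(\w+)\b[^;\n]*?\b\1\b")

# Hypothetical entry in the "x ... y ... z ... z ... x ...; z ..." shape.
entry = "alpha one beta two gamma three gamma four alpha five; gamma six"

m = SAME_WORD_AGAIN.search(entry)
print(m.group(1))  # alpha
print(m.group(0))  # alpha one beta two gamma three gamma four alpha
```

The search starts at the leftmost word, so it reports the "alpha ... alpha" span and never gets to the inner "gamma ... gamma" pair, with or without the lazy `?`.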
Can you show a real world example of what you are attempting to find?