-
1. Re: GREP searches from top of story?
P Spier Feb 3, 2010 11:05 AM (in response to RodneyA)
What's the search string?
-
2. Re: GREP searches from top of story?
RodneyA Feb 3, 2010 11:38 AM (in response to P Spier)
Well, having done a little more work, I think the problem may result from a look-behind that I added to the beginning of the string.
Basically, I have about 500 abstracts, each has a long list of authors, with punctuation as follows: SMITH, J.S., affiliation, email; JONES, S.J., affiliation, email; etc.
I need to get rid of the affiliations, emails, and semicolons; keep the Author name and initials; and apply a script of mine to put the last name in title case, resulting in
Smith, J.S., Jones, S.J., etc.
There are some typos and unusual names, so I need to do this one by one rather than with a global search, but I want to be efficient. I want to run my case-change script on the last name, then trigger a GREP search that will select the affiliation and email, which I can then look over to confirm, then delete. (Affiliation may be full of all kinds of punctuation).
Currently I'm using
.+?[;\r]
...which selects from the insertion point to the next semicolon or the end of the paragraph, whichever comes first. This one works, but I first need to move the insertion point past the initials, which takes more keystrokes.
Since the author initials always end with a period followed by a comma, but I don't want to delete those, I thought I'd add a look-behind so that the GREP would select all the text from (but not including) a period-comma to (and including) the next semicolon:
(?<+\.,).+?[;\r]
The problem is that instead of searching from the insertion point, it goes to the beginning of the file and grabs the first occurrence of the pattern, which is the first already-corrected address (since there are no semicolons left, it selects from just after the first author's initials to the paragraph end).
Anyway, if anyone has some ideas (including "sorry, GREP just behaves this way with look-behinds") I'd appreciate any help.
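For readers who want to see what the two expressions select, here is a minimal sketch in Python. It assumes the intended look-behind is (?<=\.,), uses an invented sample paragraph, and relies on Python's re module rather than InDesign's GREP engine, so it only illustrates the pattern and not the Find/Change dialog's behavior. The variable names and the simulated insertion point are hypothetical.

```python
import re

# Hypothetical sample paragraph; "\r" stands in for the paragraph end.
text = "SMITH, J.S., Univ of X, smith@x.edu; JONES, S.J., Univ of Y, jones@y.edu\r"

# Look-behind: start right after a period-comma, then lazily match up to
# and including the next semicolon or paragraph end.
pattern = re.compile(r"(?<=\.,).+?[;\r]")

first = pattern.search(text)           # search from the top of the text
cursor = text.index("JONES")           # simulated insertion point (assumption)
second = pattern.search(text, cursor)  # search forward from the cursor

print(repr(first.group()))
print(repr(second.group()))
```

In this sketch the search started at the cursor skips the first author entirely, which is the behavior the question is asking for.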
-
3. Re: GREP searches from top of story?
P Spier Feb 3, 2010 11:55 AM (in response to RodneyA)
Except for the typo in the lookbehind code ((?<+ where (?<= was presumably meant), that expression is stepping through a story for me just fine, but I'm not doing anything except hitting the Find Next button. What are you doing in between?
-
4. Re: GREP searches from top of story?
P Spier Feb 3, 2010 11:59 AM (in response to RodneyA)
OK, when I leave the Find/Change dialog to do something with the found text manually, it resets the dialog. I think for this to work you'd need a way to make the changes inside the dialog.
Maybe one of the real GREP experts has something better to say.
-
5. Re: GREP searches from top of story?
Tipo_grafo Feb 3, 2010 10:49 PM (in response to P Spier)
It would be better to see some real examples of your text, but for now let's assume that each author part consists of two segments only: Surname <comma> initial(s) <comma>, and that each group of "author - affiliation - email" is delimited by a semicolon.
If that is the case, and the semicolons are not used inside the groups, I think this expression could work:
Find: ([^,]+, [^,]+,)[^;]+;
Change: $1
The group inside the parentheses captures the author's name, which is "saved" as $1; the rest of the expression matches every following character that is not a semicolon, up to and including the next semicolon.
After running the search and replace, you can just apply 'Change case' > 'Title case' to the remaining text, or run your script.
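The Find/Change pair above can be tried outside InDesign with a rough Python equivalent. This is only a sketch under stated assumptions: the sample text is invented, Python's re module stands in for InDesign's GREP engine, and \1 plays the role of $1.

```python
import re

# Hypothetical sample; every author group ends with a semicolon.
text = "SMITH, J.S., Univ of X, smith@x.edu; JONES, S.J., Univ of Y, jones@y.edu;"

# Group 1 keeps "Surname, initials,"; the rest, up to and including the
# next semicolon, is matched and dropped by the replacement.
find = re.compile(r"([^,]+, [^,]+,)[^;]+;")
result = find.sub(r"\1", text)

print(result)
```

Note that a final author group with no trailing semicolon would not match this pattern and would be left unchanged.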
-
6. Re: GREP searches from top of story?
pkahrel Feb 4, 2010 3:01 AM (in response to RodneyA)
I think that this is indeed a case of "sorry, GREP just behaves this way with look-behinds", annoyingly. But you can get around that by searching To End of Story rather than Story or Document.
Peter




