Hi all,
I made a script to remove a range of diacritics (the squiggly bits at the top and bottom of letters) from selected text. It works, but I thought it could be made more efficient by using findText().
My question is: can one search for a range of Unicode values (or, for that matter, a list of words like mom, mum, mommy, mummy, mam, etc.) so that they can be deleted or changed to the same thing (in the case of the word list, to "mother") without having to loop through every character in the selection?
My script is:
#target "InDesign"
app.doScript("main()", ScriptLanguage.javascript, undefined, UndoModes.FAST_ENTIRE_SCRIPT, "Remove Vowels");
function main()
{
    var cc, t, w, x, d, q, myChar, unicode;
    cc = 0;
    t = app.selection[0];
    w = [];  // every character object in the selection
    x = [];  // indexes of the characters to remove
    for (d = 0; d < t.characters.length; d++) {
        w[d] = t.characters[d];
        try {
            myChar = w[d].contents;
            unicode = myChar.charCodeAt(0);
            // Unicode range to remove
            if ((unicode > 0x0590 && unicode < 0x05BE) ||
                (unicode > 0x05C0 && unicode < 0x05C3) ||
                (unicode > 0x05C3 && unicode < 0x05C6) ||
                unicode == 0x05BF ||
                unicode == 0x05C7) {
                x[cc] = d;
                cc++;
            }
        }
        catch (noUnicode) {}  // contents is not always a string (e.g. special characters)
    }
    // Remove from the end backwards so that earlier indexes stay valid
    q = cc - 1;
    while (q > -1) {
        try {
            w[x[q]].remove();
        }
        catch (error) {}
        q--;
    }
}
I would also like to know whether one can change a Unicode range or a word list using the regular InDesign Find/Change interface.
Thanks in advance.
Trevor
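To make the question concrete outside InDesign, here is a minimal sketch in plain JavaScript (it does not use the InDesign scripting API, and the names HEBREW_MARKS, MOTHER_WORDS, stripHebrewMarks and normaliseMother are made up for illustration). It shows the general idea being asked about: a single pattern with a character class covers the whole code point range from the script above, and a single alternation covers the word list, so no per-character loop is needed.

```javascript
// Plain JavaScript sketch, not InDesign API. All names here are hypothetical.

// Same code points as the loop-based script above:
// U+0591..U+05BD, U+05BF, U+05C1, U+05C2, U+05C4, U+05C5, U+05C7
var HEBREW_MARKS = /[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]/g;

function stripHebrewMarks(text) {
    // One replace call removes every character in the class
    return text.replace(HEBREW_MARKS, "");
}

// Word list as one alternation; \b keeps "moment" from matching "mom"
var MOTHER_WORDS = /\b(?:mommy|mummy|mom|mum|mam)\b/g;

function normaliseMother(text) {
    return text.replace(MOTHER_WORDS, "mother");
}
```

For example, stripHebrewMarks leaves the maqaf (U+05BE) alone because it is not in the class, and normaliseMother("my mom and your mummy") changes both words to "mother".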
Peter
Brilliant, I was at least 90% sure that you would be the one to answer.
In a script it goes:
var mySelection = app.selection[0];
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
app.findChangeGrepOptions.includeFootnotes = false;
app.findChangeGrepOptions.includeHiddenLayers = false;
app.findChangeGrepOptions.includeLockedLayersForFind = false;
app.findChangeGrepOptions.includeLockedStoriesForFind = false;
app.findChangeGrepOptions.includeMasterPages = false;
//Unicode Range
app.findGrepPreferences.findWhat = "[<0591>-<05BD>x<05C1>x<05C2>x<05C4>x<05C5>x<05C7>x<05BF>]";
app.changeGrepPreferences.changeTo = NothingEnum.nothing;
mySelection.changeGrep();
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
I found the basic <0591> format in a post of yours on this forum from 5 years back: http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html
and your answer above gave away the missing details.
I guess it would be a very good idea to buy this book http://shop.oreilly.com/product/9780596156015.do
This script is countless times quicker than the one above.
Thanks a million.
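One caveat worth checking, assuming InDesign's GREP engine follows ordinary regular-expression rules here: inside square brackets every character is a member of the class, so an "x" used as a separator between code points would itself be matched. The plain JavaScript sketch below (not InDesign API, variable names are illustrative only) shows the effect with a shortened class.

```javascript
// Plain JavaScript sketch, not InDesign API. Names are hypothetical.

// Shortened class with "x" between the code points
var withSeparators = /[\u0591-\u05BDx\u05BFx\u05C1]/g;
// Same code points with nothing between them
var withoutSeparators = /[\u0591-\u05BD\u05BF\u05C1]/g;

// A Latin word followed by bet with dagesh (U+05D1 U+05BC)
var sample = "fax \u05D1\u05BC";

var resultWith = sample.replace(withSeparators, "");       // the Latin "x" is removed too
var resultWithout = sample.replace(withoutSeparators, ""); // only the dagesh is removed
```

In purely Hebrew text this makes no visible difference, which may be why it goes unnoticed; in mixed text the version without separators is the safer one to test.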
Thanks Peter,
When I wrote about using the <0000> format, I was referring to scripting and not to the grep tab.
I think you must have seen this post in email form and missed the lines of scripting.
In scripting
these three options work:
app.findGrepPreferences.findWhat = "[<0591>-<05BD>x<05C1>x<05C2>x<05C4>x<05C5>x<05C7>x<05BF>]";
app.findGrepPreferences.findWhat = "[\u0591-\u05BCx\u05c2x\u05C4x\u05C7x\u05BF]";
app.findGrepPreferences.findWhat = "[֑-ֿxׁxׂxׄxׅxׇ]";
This does not:
app.findGrepPreferences.findWhat = "[\x{0591}-\x{05BD}x\x{05C1}x\x{05C2}x\x{05C4}x\x{05C5}x\x{05C7}x\x{05BF}]";
Don't know why.
On the grep tab
the unlucky option is [\u0591-\u05BCx\u05c2x\u05C4x\u05C7x\u05BF], which does not work properly (in fact it hardly works at all!).
[\x{0591}-\x{05BD}x\x{05C1}x\x{05C2}x\x{05C4}x\x{05C5}x\x{05C7}x\x{05BF}] scores top for readability.
Both [<0591>-<05BD>x<05C1>x<05C2>x<05C4>x<05C5>x<05C7>x<05BF>] and [֑-ֿxׁxׂxׄxׅxׇ] work (as you said, the first becomes the second once entered). They are harder to read, but on the other hand easier to enter if you can read them.
Anyway, I'm quite pleased that, from not knowing any way to use the grep tab or the scripting app.findGrepPreferences.findWhat = method (besides one diacritic at a time!), I now know 3 for each!
Regards, Trevor
P.S. I plan to get the book later in the day!
If you write out GREP expressions in JavaScript to use with findGrep/changeGrep, you must take into account that backslashes inside a JavaScript string need escaping. Therefore you need to double each of them:
\\x{0591}
(etc.)
The "exceptions" -- there are always some -- are \r, \t, and \n, but in fact those aren't as special as they seem. They get translated into literal character codes for Carriage Return, Tab, and Line Feed, and as it happens, those can be fed as well into the findWhat string, even though you cannot type them in the interface (after inserting them with your script, sometimes you can see the GREP find field struggle with trying to display the string).
You could check whether the special Unicode GREP group "\p{Mn}" finds all of the non-spacing marks you want to get rid of -- I think this class of commands is mentioned in Peter's book as well.
Jongware
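The escaping point can be checked in any JavaScript engine without InDesign. The sketch below (plain JavaScript, variable names are illustrative only) shows what is left of each string literal after JavaScript has parsed it, which is what a GREP engine would then receive. ExtendScript is assumed here to behave like standard JavaScript for these particular escapes.

```javascript
// Plain JavaScript sketch. Each variable holds what remains after the
// JavaScript parser has processed the string literal.

var doubled = "\\p{Mn}";  // backslash, p, {, M, n, } : the backslash survives
var single = "\p{Mn}";    // p, {, M, n, } : the lone backslash is dropped
var unicode = "\u0591";   // one character: JavaScript resolves \u itself
var tab = "\t";           // one character: a literal tab (code 9)

var doubledLength = doubled.length;
var singleLength = single.length;
```

This would also explain why the "\u0591" form works in a script but not in the grep tab: in a script, JavaScript has already turned it into the literal character before GREP sees it.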
I should have been able to figure out the escaping of the \,
Oh well better luck next time.
So now I have another 2 methods for the scripting:
app.findGrepPreferences.findWhat = "[\\x{0591}-\\x{05BD}x\\x{05C1}x\\x{05C2}x\\x{05C4}x\\x{05C5}x\\x{05C7}x\\x{05BF}]";
and
app.findGrepPreferences.findWhat = "\\p{Mn}";
I did try the \p{Mn} method in the script, but it didn't work because I didn't escape it.
And in the grep tab, another one:
\p{Mn}
Well, suddenly overwhelmed with choice, the winning script is:
var mySelection = app.selection[0];
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
// Unicode category Mn (non-spacing marks)
app.findGrepPreferences.findWhat = "\\p{Mn}";
app.changeGrepPreferences.changeTo = NothingEnum.nothing;
mySelection.changeGrep();
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
Short and sweet (and quick).
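The same category can be tried in a modern JavaScript engine such as Node, whose regular expressions accept \p{Mn} when the u flag is set (ExtendScript's own RegExp is assumed not to, which does not matter for the script above because InDesign's GREP engine does the matching). This sketch, with hypothetical names, also shows that the category is broader than the Hebrew ranges: it matches non-spacing marks in any script.

```javascript
// Plain JavaScript sketch for a modern engine (ES2018+), not InDesign API.
// NONSPACING_MARKS and stripMarks are hypothetical names.

var NONSPACING_MARKS = /\p{Mn}/gu;

function stripMarks(text) {
    // Removes every character whose Unicode general category is Mn
    return text.replace(NONSPACING_MARKS, "");
}

// Hebrew word with vowel points and a shin dot
var hebrew = stripMarks("\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD");
// Latin e followed by a combining acute accent (U+0301)
var latin = stripMarks("e\u0301");
```

If only Hebrew points should go and combining marks in other scripts must stay, the explicit ranges are the narrower choice.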
Peter
I kept my word about getting the book, and you can test me on the 37 \p{} methods tomorrow.
T Y
I think that comparing my original script with this final one gives an excellent example of how well and how poorly a script can be made.
Well, I'm happy that I saw there was a problem, didn't have that "very British attitude", and did complain, and something did change!
(see towards the bottom of http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html)
Peter Kahrel wrote:
Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.
Wow. How have I gone this long without knowing about these? Guess I should have read your book. Here's another resource.
Jeff
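The two properties Peter mentions can also be tried in a modern JavaScript engine (ES2018+, with the u flag). This is a plain JavaScript sketch with hypothetical names, not InDesign API, and InDesign's GREP engine is only assumed to classify the same characters in the same way.

```javascript
// Plain JavaScript sketch for a modern engine (ES2018+), not InDesign API.
// All names here are hypothetical.

var SPACES = /\p{Zs}/gu;  // space separators: no-break space, em space, etc.
var DASHES = /\p{Pd}/gu;  // dash punctuation: hyphen, en dash, em dash, etc.

function normaliseSpaces(text) {
    // Every kind of space becomes a plain space; tabs and returns are untouched
    return text.replace(SPACES, " ");
}

function normaliseDashes(text) {
    // Every kind of hyphen or dash becomes a plain hyphen
    return text.replace(DASHES, "-");
}
```

For instance, normaliseSpaces turns a no-break space (U+00A0) and an em space (U+2003) into ordinary spaces but leaves a tab alone, and normaliseDashes turns en and em dashes into hyphens.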
It's never too late, Jeff! That source you mention is indeed very good. It's where I first learnt grep, back in CS2 days. It's not InDesign-specific though, so not everything discussed there applies to InDesign. Good site nevertheless. Those codes are illustrated with an InDesign document here: http://www.kahrel.plus.com/indesign/grep_mapper.html
Peter