Hi,
I have a document that contains both Chinese and English text, and each language has its own font. I am trying to find out whether any Chinese text has the English font applied to it. If so, I need to apply a specific color to it.
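var myarray = []; // collects the contents of every match
app.findGrepPreferences = null;
app.findGrepPreferences.findWhat = '[a-zA-Z0-9]';
app.findGrepPreferences.appliedFont = "ITC Avant Garde Gothic Std";
var found = app.documents[0].findGrep();
for (var i = 0; i < found.length; i++) {
    // findGrep() returns an array, so each match must be indexed
    if (found[i].contents) {
        myarray.push(found[i].contents);
        // color the match directly; a swatch named "Chinese" must exist
        found[i].fillColor = "Chinese";
    }
}
app.findGrepPreferences = null;
alert(myarray);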
How can I find characters of the other language, the same way I find English characters in the find-and-replace code above?
Thanks,
K
Hello Experts,
May I get any suggestions?
Hello Experts,
I still have not found a solution on my end. May I get any suggestions?
You'll need to put together a list of all possible Chinese characters. You can search for ranges of characters in GREP using their Unicode values. So in the script above, something like this will find the basics:
app.findGrepPreferences.findWhat = '[\\x{4E00}-\\x{9FFF}]';
But to be more thorough, you will have to decide exactly which Chinese characters, special characters, and punctuation you're looking for. It looks like a big discussion. See here for example: cjk - What's the complete range for Chinese characters in Unicode? - Stack Overflow
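As a rough illustration of how that range could be dropped into the original script, here is a minimal sketch. The helper name `markHanInFont` and the constant `HAN_BASIC` are made up for this example, and it assumes the script runs inside InDesign (global `app`) and that a swatch with the given name already exists in the document.

```javascript
// Basic CJK Unified Ideographs block only; punctuation and the
// extension blocks are deliberately not covered here.
var HAN_BASIC = '[\\x{4E00}-\\x{9FFF}]';

// Hypothetical helper: colors every Han character that is set in fontName.
// Assumes the InDesign global `app` and an existing swatch called swatchName.
function markHanInFont(doc, fontName, swatchName) {
    app.findGrepPreferences = null;            // clear earlier find settings
    app.findGrepPreferences.findWhat = HAN_BASIC;
    app.findGrepPreferences.appliedFont = fontName;
    var found = doc.findGrep();                // array of matching text objects
    for (var i = 0; i < found.length; i++) {
        found[i].fillColor = swatchName;       // mark the match
    }
    app.findGrepPreferences = null;            // leave the find settings clean
    return found.length;
}

// Usage inside InDesign:
// var n = markHanInFont(app.documents[0], "ITC Avant Garde Gothic Std", "Chinese");
// alert(n + " characters marked");
```

Returning the number of matches makes it easy to see at a glance whether anything was caught.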
Ariel
Hi,
Which OS and version of InDesign are you using?
P.
I am using InDesign CS6; the OS details are shown below.
I have a plugin that highlights languages in a document. P.M. me, I will send it to you later today.
Actually, could you use the style highlighter script?
Indiscripts :: The Hidden Way to Highlight Styles
P.
Hi Pickory,
I just sent you a PM. Yes, the highlighter script won't help in this regard.
Hi Ariel,
Thank you for the reply. But the booklet contains both English words and the corresponding Chinese words, so I can't list all of them in GREP.
Regards,
K
You don't need to list all the words! You just need to list all the possible characters! Get a list of all possible characters. Type them into the Grep find field. Set the formatting to the English font. Now do a GREP search.
The result will be all Chinese characters that have the English font applied to them.
QED
Hi TaW,
Thanks, but I am afraid I might miss some characters from the list. I thought we could run the GREP search for everything except what is listed in my original code.
For example, instead of if(found.contents){
we could use something like if(!found.contents){
Well, yes, sure, you can do that too. Instead of:
app.findGrepPreferences.findWhat = '[a-zA-Z0-9]'
Just use
app.findGrepPreferences.findWhat = '[^a-zA-Z0-9]'
If you're sure that's what you want.
TᴀW wrote
… If you're sure that's what you want.
Hi Ariel,
hm…
I'm not so sure. No, I don't think that's what Kartik wants.
My screenshot below is showing some dummy text where a condition is applied to the found text:
I think the correct way would be to search for Unicode ranges, as you already suggested in reply 3.
And apply something to it. A condition like the one I presented in my screenshot would be great.
And one could choose a color for the condition that can also be printed or exported to PDF.
FWIW: There is really no need for applying a character style.
On the contrary: We should spare character styles for other kinds of formatting, because maybe character styles are already in use and it would be destructive to apply a character style for the found Chinese characters.
The used GREP could include ranges for blocks containing Han Ideographs like suggested at Github.
Block | Range | Comment |
---|---|---|
CJK Unified Ideographs | 4E00-9FFF | Common |
CJK Unified Ideographs Extension A | 3400-4DBF | Rare |
CJK Unified Ideographs Extension B | 20000-2A6DF | Rare, historic |
CJK Unified Ideographs Extension C | 2A700-2B73F | Rare, historic |
CJK Unified Ideographs Extension D | 2B740-2B81F | Uncommon, some in current use |
CJK Unified Ideographs Extension E | 2B820-2CEAF | Rare, historic |
CJK Compatibility Ideographs | F900-FAFF | Duplicates, unifiable variants, corporate characters |
CJK Compatibility Ideographs Supplement | 2F800-2FA1F | Unifiable variants |
But the ranges above are not sufficient, as the following little experiment shows:
I copied some Chinese text (no idea what it is saying) from the net to my InDesign page and ran the following GREP on it:
[\x{4E00}-\x{9FFF}\x{3400}-\x{4DBF}\x{20000}-\x{2A6DF}\x{2A700}-\x{2B73F}\x{2B740}-\x{2B81F}\x{2B820}-\x{2CEAF}\x{F900}-\x{FAFF}\x{2F800}-\x{2FA1F}]
What is obviously missing are punctuation characters, brackets, and quotation marks.
Plus, at least in this text sample, a simple blank character that maybe should not be there (second line of the Chinese text).
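A long character class like the one above is easy to mistype by hand, so one option is to build it from the table of ranges. The following is only a sketch: `HAN_RANGES` and `buildGrepClass` are names invented for this example, and whether InDesign's GREP handles every supplementary-plane range the same way in all versions is an assumption worth testing on a real document.

```javascript
// Ranges copied from the table of Han ideograph blocks (hex, inclusive).
var HAN_RANGES = [
    ["4E00", "9FFF"],   // CJK Unified Ideographs
    ["3400", "4DBF"],   // Extension A
    ["20000", "2A6DF"], // Extension B
    ["2A700", "2B73F"], // Extension C
    ["2B740", "2B81F"], // Extension D
    ["2B820", "2CEAF"], // Extension E
    ["F900", "FAFF"],   // CJK Compatibility Ideographs
    ["2F800", "2FA1F"]  // CJK Compatibility Ideographs Supplement
];

// Hypothetical helper: turns [start, end] hex pairs into one GREP
// character class such as [\x{4E00}-\x{9FFF}\x{3400}-\x{4DBF}].
function buildGrepClass(ranges) {
    var parts = [];
    for (var i = 0; i < ranges.length; i++) {
        parts.push("\\x{" + ranges[i][0] + "}-\\x{" + ranges[i][1] + "}");
    }
    return "[" + parts.join("") + "]";
}

// Usage inside InDesign:
// app.findGrepPreferences.findWhat = buildGrepClass(HAN_RANGES);
```

Punctuation, brackets, and quotation marks would still need their own ranges added to the list once it is decided which ones count as Chinese.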
Source of the text:
Don't know if I'm on the right track.
Hope that helps.
Regards,
Uwe
Hi Uwe,
Thank you for your interest in this thread. I have quite limited access to the client file, so applying conditional text is probably not possible in this case.
That is why I am looking for a simple GREP solution.
Thanks again,
K
Hi Kartik,
applying conditional text is just one option out of some to mark the found text visually.
If you are sure that there are no character styles applied to the text in the document, then go ahead with assigning a character style to the found text.
Since I have no access to your document(s), I'd say there is no simple GREP solution.
Here is an example where I expanded the range of characters a bit, but it still would not cover all the necessary ones in my little example:
We could still use some lookarounds to catch the missing ones.
Not so easy…
Regards,
Uwe
Hi Uwe,
Yes, I have some character styles applied within the document. Also, my idea is not to build a GREP list of all Chinese characters, because I might miss something anyway. So it is better to check whether the character is English or not; if it is not, I mark it with a swatch.
Thanks,
K
As you can see from my example, the pasted Chinese text is using some typical characters that are ALSO used with English text.
Among them a pair of brackets ( 0028 and 0029 ) and a "stray" blank ( 0020 ). So it's not "English or not" whatever English means. E.g. German would share the same characters with English. But not the other way around.
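To make the point concrete, here is a small sketch that sorts the characters of a string into three buckets. It uses a plain JavaScript test rather than InDesign's GREP, the function names are invented for this example, and it only looks at the basic 4E00-9FFF block, so it is an illustration rather than a complete classifier.

```javascript
// Hypothetical classifier: "latin" for [a-zA-Z0-9], "han" for the basic
// CJK block, "other" for everything else (spaces, brackets, punctuation).
// Works per UTF-16 unit, so the extension blocks above FFFF are not handled.
function classifyChar(ch) {
    var code = ch.charCodeAt(0);
    if (/[a-zA-Z0-9]/.test(ch)) { return "latin"; }
    if (code >= 0x4E00 && code <= 0x9FFF) { return "han"; }
    return "other";
}

// Counts how many characters of the text fall into each bucket.
function countBuckets(text) {
    var counts = { latin: 0, han: 0, other: 0 };
    for (var i = 0; i < text.length; i++) {
        counts[classifyChar(text.charAt(i))]++;
    }
    return counts;
}

// Two Han characters, a blank, a pair of brackets, three Latin letters.
var sample = "\u4E2D\u6587 (ABC)";
var result = countBuckets(sample);
```

A search for `[^a-zA-Z0-9]` would match everything in both the "han" and the "other" bucket, which is why the blank and the brackets end up marked as well.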
Without seeing your document I am running out of suggestions.
Regards,
Uwe
Hi Uwe,
Thank you for your suggestions; they are really helpful. Yes, the punctuation is a problem as well. Using GREP will reduce the manual work a little, but I will need to check the other characters, like punctuation, separately.
I feel bad that I am not able to provide the client document to you. Sorry about that.
Regards,
K
( No need to feel sorry… )
Perhaps there are also other problems ahead:
One of the toughest things could be to find out whether some text that is meant to be English is mistakenly typed with FULLWIDTH characters somewhere in the range FF10 to FF5A. Then you would need to map the FULLWIDTH characters to "normal" characters if you want to apply "ITC Avant Garde Gothic Std", which would not contain FULLWIDTH characters.
The same perhaps goes for FULLWIDTH digits.
E.g., a FULLWIDTH DIGIT ZERO ( FF10 ) could be mapped to DIGIT ZERO ( 0030 ).
But that will depend on the individual case of course.
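The mapping itself is simple arithmetic, because the FULLWIDTH forms sit at a fixed offset of FEE0 from their ASCII counterparts (FF10 minus FEE0 is 0030). A minimal sketch, with `fullwidthToAscii` being a name invented for this example and the range limited to FF10-FF5A as mentioned above:

```javascript
// Hypothetical helper: maps FULLWIDTH characters in FF10-FF5A to their
// ASCII counterparts by subtracting the fixed offset 0xFEE0.
// Everything outside that range is passed through unchanged.
function fullwidthToAscii(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var code = text.charCodeAt(i);
        if (code >= 0xFF10 && code <= 0xFF5A) {
            code -= 0xFEE0;
        }
        out += String.fromCharCode(code);
    }
    return out;
}

// Usage inside InDesign, on a hypothetical found text object:
// found[i].contents = fullwidthToAscii(found[i].contents);
```

Note that FF10-FF5A also contains a few FULLWIDTH punctuation marks between the digits and the letters; they are mapped as well, which may or may not be wanted.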
Regards,
Uwe
Hi TaW,
Yes, for now I just need to exclude something in my GREP search. That is great and helpful for moving on to my next step.
Thanks,
K