
Can I choose a backup-font for missing characters in a style? Or is there a way to select pink areas

Nov 22, 2012 1:08 PM

I have to lay out tons of UTF-8 data from a MySQL database.

There is one problem. The designer picked a font that does not contain the Japanese characters, but there is a lot of inline Japanese in the text.

So the imported text has many pink areas (characters not present in the font).

 

So I am looking for any of the following options:

  • Is there a way to define a fallback font (like web browsers do)? If you view UTF-8 text in TextEdit (Mac) or in the browser, all weird characters are displayed using Osaka. Is there a way to let InDesign replace the pink squares with Osaka?
  • Is there a way to SELECT all pink squares and apply a character style to them? That way I can create a character style with the Osaka font.
  • Is there a way to have InDesign parse some sort of markup elements in plain text? I could tweak the MySQL output with PHP to wrap all strange symbols in a markup item like [japanese]blabla[/japanese]. If that were converted to the character style "Japanese" on import, it would help me a lot as well.
  • The font hack: is there a way to load all available characters of the design font into Osaka, replacing those characters, to make an Osaka2? Or would that result in a mess?

 

I think some automation to get a character style on the pink areas is best, as that allows me to tweak font height and baseline shift to match the other font.

 

Does anybody know how to do one of these four, or maybe a fifth way to tackle my problem?

 
Replies
  • Nov 22, 2012 1:29 PM   in reply to m2d3x4

    This is a quiet day here on the forums because of the Thanksgiving holiday here in the US, so there aren't a lot of us responding.  I think what you want to do is possible using a GREP style as part of the paragraph style. What you want to do is apply your font to any character in a prescribed Unicode range, I think.

     

    The problem is I don't know exactly which range has the Japanese glyphs, and my GREP is not fluent enough to rattle off the correct expression, though I've done a little research and I think it should look something like [\x{5026}-\x{9D60}]+  (I probably don't have that range correct).  This may have been covered before if you search here, or with any luck Jongware or Peter Kahrel will see this (or one of our other GREP experts) and will offer up the correct expression.

     
  • Nov 22, 2012 1:34 PM   in reply to Peter Spier

    Peter, it looks good. How's the turkey going down over there? (Calculating time difference..) Popped it out of the oven yet?

     
  • Nov 22, 2012 1:43 PM   in reply to [Jongware]

    [Jongware] wrote:

     

    Peter, it looks good.

    Really? I'm quite surprised. A little googling tells me there are a number of different types of glyphs for different things, and they seem to have different ranges. I plead complete ignorance of typesetting oriental languages.

     

    Also a word of caution to the OP: In case Joel Cherney, our translation expert, doesn't show up, he would tell you that with complex scripts like this it is very important that you have the correct glyphs or you risk changing the meanings or making the author look like an idiot. If you don't read Japanese you would be well advised to have the document checked by someone who does to be sure it reads OK before it goes into the wild.

     

    I think the turkey should be ready in about half an hour. It's starting to smell pretty good...

     
  • Nov 22, 2012 3:51 PM   in reply to Peter Spier

    Um, I didn't actually check your proposed Unicode range (But surely you didn't make it up didya?)

     

    Scanning older posts on same revealed I suggested this one for Chinese:

     

    [\x{3000}-\x{efff}]+

     

    .. and that was based on my guess that, even though this range includes a variety of other scripts as well, it's unlikely that, say, a Thai or Gujarati glyph will pop up in mixed English/Chinese text. (Even if it does and it's not caught by the replacement font, it's still not a disaster, because all you have to do is apply a font manually. Or add another GREP style.)
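As a quick way to see what a broad range like this catches, the same expression can be tried outside InDesign. The following is a minimal sketch in Python, which writes the code points as `\u3000` rather than InDesign's `\x{3000}`. The sample sentence and the function name `find_runs` are made up for illustration; nothing here touches InDesign itself.

```python
import re

# The broad range suggested above, rewritten in Python escape syntax.
# InDesign GREP writes a code point as \x{3000}; Python writes it as \u3000.
BROAD_RANGE = re.compile("[\u3000-\uefff]+")


def find_runs(text):
    """Return every run of consecutive characters that falls inside the range."""
    return BROAD_RANGE.findall(text)


# Hypothetical sample: Latin text with inline kanji and hiragana.
sample = "Tokyo (東京) is written とうきょう in hiragana."
print(find_runs(sample))  # ['東京', 'とうきょう']
```

Each run found this way corresponds to a stretch of text that a GREP style using the same range would restyle.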

     

    I've always understood that "Japanese", as a script, does not exist -- there is your hip modern style Katakana (the flashy swashes you always see in commercials), sort of analogous to our Western character based script, and the venerable old long list of ancient Chinese-origins of Hiragana, which comes closer to the original syllable/word/phrase/homonym meaning we normally associate with Chinese.

    (Recently, Roman script or "Romaji" has become the thing to use in your average hipper-than-hip commercial, no doubt much to the chagrin of the venerable users of Hiragana, after having to learn all of those thousands of characters.)

     

    So, without knowing anything of the character range used in the OP's text, applying the over-filled set of Chinese glyphs (which thus also should include Katakana) surely wouldn't hurt.

     
  • Nov 22, 2012 4:19 PM   in reply to [Jongware]

    I grabbed my range by opening MS Mincho in the Glyphs panel and taking the JIS90 Forms range. Kozuka Gothic Pro has quite a few other Kanji groups, it seems, over a broader range...

     
  • Nov 23, 2012 10:58 AM   in reply to m2d3x4

    Thanks for standing in for me, Peter - we were hosting the extended family, I did nothing but cook for hours.

    m2d3x4 wrote:

    Is there a way to have InDesign parse some sort of markup elements in plain text? I could tweak the MySQL output with PHP to wrap all strange symbols in a markup item like [japanese]blabla[/japanese]. If that were converted to the character style "Japanese" on import, it would help me a lot as well.

     

    This is what I'd do, honestly.  It depends on your available output-from-DB formats, and what you're placing/importing into ID. But if you're placing raw text (?) then you can't simply use Type -> Find Font, so the GREP solutions offered here are good ones.
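The wrapping step could look roughly like the sketch below. It is written in Python rather than the PHP the OP mentioned, and the same idea should carry over to PHP's `preg_replace_callback`. The [japanese] marker comes from the original question; the character range, the function name `tag_japanese`, and the sample text are assumptions for illustration. Turning the markers into a character style would still need a step inside InDesign, which is not shown here.

```python
import re

# Assumed range: CJK punctuation, kana and the main kanji block.
# Adjust it to whatever actually appears in the database text.
CJK_RUN = re.compile("[\u3000-\u9fff]+")


def tag_japanese(text, tag="japanese"):
    """Wrap every run of characters in the assumed range in [tag]...[/tag]."""
    return CJK_RUN.sub(lambda m: f"[{tag}]{m.group(0)}[/{tag}]", text)


# Hypothetical sample row from the database.
row = "See 東京 (とうきょう) for details."
print(tag_japanese(row))
# See [japanese]東京[/japanese] ([japanese]とうきょう[/japanese]) for details.
```

Tagging whole runs rather than single characters keeps the marked-up text readable and gives one styled span per Japanese phrase.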

     

    m2d3x4 wrote:

    I think some automation to get a character style on the pink areas is best, as that allows me to tweak font height and baseline shift to match the other font.

     

    Note that the "baseline" in Japanese is not the same place as the baseline in Latin-script. Japanese doesn't really have a baseline - they operate with a completely separate set of typographical conventions - but the effect is that you should not line up the bottom of your Japanese glyphs with the Latin-script baseline. Set some bilingual text in a good font (Kozuka Mincho Pro, it came with ID, you probably already have it installed) and take a look.

    Peter Spier wrote:

    it is very important that you have the correct glyphs or you risk changing the meanings or making the author look like an idiot. If you don't read Japanese you would be well advised to have the document checked by someone who does to be sure it reads OK before it goes into the wild.

    Chances that you'll make something completely illegible are minimal when handling Japanese, but there are a very large number of reasons why you can't just flow Japanese into English-language ID and expect it to work. There are permissible places to wrap Japanese, and places where it's not permissible. The paragraph composer you'd be using if you are using English-language ID will happily break Japanese in the wrong place. This is analogous to bad hyphenation, but looks far more idiotic to the Japanese readership than bad hyphenation looks to an English-language readership (who probably wouldn't notice bad hyphenation unless they happen to be a design-savvy readership). Look into using something like Japanese InDesign or World Tools Pro or the TransPacific Digital Japanese-composer template files.

     
  • Nov 23, 2012 11:01 AM   in reply to [Jongware]

    [Jongware] wrote:

    I've always understood that "Japanese", as a script, does not exist

    This is totally true, but... exactly how much do you guys want to know about this stuff?   I put in a separate post because it's not really relevant to the OP, but I couldn't stop writing about it.

     

    You're pretty close, Theun, but a little bit off the mark. Japanese writing uses four different scripts, and you are correct when you call Japanese use of Latin script "romaji" - romaji is, for sure, a Japanese writing system. It's Latin script, but adapted to Japanese use. Likewise, kanji are Chinese characters adapted to Japanese use.

     

    Hiragana and katakana are both in the category of "kana" - basically phonetic syllabaries - but katakana used to be used exclusively for loan-words from other languages. Hiragana is used to write Japanese language phonetically - but not whole sentences, or even whole words, if you're fully literate and know thousands of kanji. But there are only forty-something hiragana; I forget exactly how many there are, but they encode basically the same sounds as do katakana.

     

    In order to write Japanese well, you must use all three writing systems. The reason that line-breaks are so important in InDesign is because a correctly written Japanese word will often require both kanji and hiragana. Chinese script isn't really ideographic, but using emoticons (sometimes called "emoji") is useful to explain this to an audience of people who grew up with English: you use hiragana to tack conjugations and phrase-particles and such to kanji.  So a little kid would write "I am not laughing at you" strictly phonetically in hiragana, but an adult would know the kanji for the words that ought to be written in kanji and would instead write:

     

    "I am not [laughing emoticon]ing @ you"

     

    with the symbols representing the kanji.

     

    If you wrap the "ing" down to the next line, then it will attach to the @, not the [laughing emoticon], causing nonsense. Probably still intelligible, but still, it's the moral equivalent of writing English with no spaces and no hyphens. It just indicates that you are subliterate.

     
  • Nov 24, 2012 11:17 AM   in reply to [Jongware]

    I often use ID's GREP to apply character styles to strings of CJK text, but I search for <[\x{2E80}-\x{9FBB}]+>.  The lower start includes the Kangxi radicals and CJK radicals supplement, which occasionally turn up in the text I see -- though few Chinese fonts offer them.  I probably should raise my top to 9FCC, technically the high end of "CJK Unified" chars. in the base plane (some "CJK Compatibility" forms run higher,  and include some full-width punctuation I like to use, though I usually replace any characters up there with more standard versions).  I'd avoid including the range "E000-F8FF": the Private Use Area, where alphabetic fonts tuck various oddities, and where I and others code some custom chars.
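To compare the three ranges proposed in this thread, a small check like the one below can help. It is only a sketch: the poster names are used as labels for the ranges quoted above, the helper names `covering` and `is_private_use` are invented, and the four test characters (one hiragana, one katakana, one kanji, one Private Use code point) are arbitrary picks.

```python
import unicodedata

# The ranges quoted in this thread, as inclusive code point bounds.
RANGES = {
    "Peter": (0x5026, 0x9D60),
    "Jongware": (0x3000, 0xEFFF),
    "David": (0x2E80, 0x9FBB),
}


def covering(ch):
    """Return the labels of the ranges that contain the character."""
    cp = ord(ch)
    return [name for name, (lo, hi) in RANGES.items() if lo <= cp <= hi]


def is_private_use(ch):
    """True if Unicode classifies the character as Private Use (category Co)."""
    return unicodedata.category(ch) == "Co"


for ch in "あア東\ue000":
    note = "private use" if is_private_use(ch) else ""
    print(f"U+{ord(ch):04X}", covering(ch), note)
```

With these picks, the kana fall outside the first range but inside the other two, and the only range that reaches the Private Use code point is the one extending to EFFF, which matches the caution above.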

     

    David

     