
Displaying International Characters

Jun 4, 2008 9:22 PM

Some users have been concerned about the fact that Buzzword does not display some international characters - ranging from Greek to Russian. This is accentuated by the fact that we have Buzzword users in well over 100 countries.

The problem occurs when users attempt to insert some international characters - say, the Greek letter omega - and Buzzword instead displays a dot on the screen. Here's what's going on, for anyone interested:

Like virtually all modern software, Buzzword adheres to the Unicode standard, where characters are defined with 16 bits, resulting in a total of over 65,000 possible characters.

However, unlike most desktop software, Buzzword must use something called "embedded fonts". This means that we can't read fonts off a user's computer, but instead we have to download fonts from our server.

This is where our challenge begins. A font family contains characters - called "glyphs" when drawn on the screen - for some portion of the 65,000 possible characters defined by Unicode. Each available character is downloaded as a small program containing instructions on how to draw the glyph. The instructions are relatively small, but each takes time to download - you can see evidence of this in our "loading fonts" progress bar.

For Buzzword to load relatively quickly, we need to limit the number of characters downloaded with each of our seven font families. Most people use far fewer than 65,000 characters, so for our first phase of deployment, we identified a couple hundred characters to download for each font family. Because our initial market focus was North America, we chose characters from Latin-1, the Western European character set.

The result: when a user attempts to enter the Greek letter omega, Buzzword recognizes the Unicode character but does not have the downloaded instructions to display the glyph on the screen. The little dot that is displayed instead is an indication that the requested glyph has not been downloaded with the font set. If the user were to export the document to be read by a desktop program, the glyph would probably be displayed using the computer's fonts.
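
To make the mechanism concrete, here is a minimal Python sketch of the lookup described above. It is only an illustration, not Buzzword's code (Buzzword runs in Flash Player): `LATIN1_SUBSET`, `MISSING_GLYPH`, `render`, and `missing_characters` are made-up names, and the subset is assumed to be the printable Latin-1 range.

```python
# Hypothetical stand-in for an embedded font subset: the printable
# Latin-1 code points (U+0020-U+007E and U+00A0-U+00FF).
LATIN1_SUBSET = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))

MISSING_GLYPH = "\u00b7"  # assumed placeholder: a middle dot


def render(text, subset=LATIN1_SUBSET):
    """Return text with every character outside the subset replaced by the placeholder."""
    return "".join(ch if ord(ch) in subset else MISSING_GLYPH for ch in text)


def missing_characters(text, subset=LATIN1_SUBSET):
    """Return the characters in text that the subset cannot draw, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) not in subset})
```

With this subset, `render("café Ω")` keeps the é, which is part of Latin-1, and replaces the omega with the placeholder dot.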

Longer term, we'll handle this differently by downloading fonts dynamically, based on the document's contents and a user's settings. In the meantime, we apologize to everyone who uses characters outside the Western European set. We will work to get you a solution as soon as we possibly can.
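
One way to picture the "download based on the document's contents" idea is the Python sketch below. It is an illustration under stated assumptions, not a description of any planned Buzzword design: the subset names and the `subsets_needed` helper are hypothetical, and the boundaries simply follow Unicode block ranges.

```python
# Hypothetical subset table. The ranges follow Unicode block boundaries;
# how a real service would split its font files is an assumption.
SUBSETS = [
    ("latin-1", 0x0000, 0x00FF),
    ("latin-extended-a", 0x0100, 0x017F),
    ("greek", 0x0370, 0x03FF),
    ("cyrillic", 0x0400, 0x04FF),
    ("hebrew", 0x0590, 0x05FF),
    ("arabic", 0x0600, 0x06FF),
]


def subsets_needed(document_text):
    """Return (subset names to download, characters no subset covers)."""
    needed = set()
    uncovered = set()
    for ch in set(document_text):
        cp = ord(ch)
        for name, lo, hi in SUBSETS:
            if lo <= cp <= hi:
                needed.add(name)
                break
        else:
            # No subset in the table contains this character.
            uncovered.add(ch)
    # Report names in table order so the result is deterministic.
    ordered = [name for name, _, _ in SUBSETS if name in needed]
    return ordered, uncovered
```

A document containing only English text would need just the first subset, so the common case would stay as fast as it is today.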
 
Replies
  • Jun 10, 2008 4:23 AM   in reply to Tad Staley
    I hope you remember to work out how to handle right-to-left languages like Hebrew!

    Also, you chose OpenType fonts such as Minion, but I see no ligatures, true small caps, ranging figures, etc. Here is an ideal opportunity to make simple text documents look great. Heck, if Windows Notepad can handle them, then why not a program created by Adobe?
     
  • Jun 10, 2008 1:11 PM   in reply to Raphael Freeman
    Great suggestion, Raphael. We are continuing to work hard to add support for more fonts and languages.

    We appreciate your comments, suggestions, and feedback.
     
  • Jun 15, 2008 8:10 PM   in reply to Tad Staley
    i'd like to suggest something *other* than a dot for missing glyphs. it too closely resembles an ellipsis (especially those used in math notation) for the average user to tell apart.

    i'd also like to strongly suggest that you guys make note of this issue, especially the part that it's a known problem & only a short-term one, on buzzword's help page instead of "hiding" it away in these forums ;-)

    thanks.
     
  • Jun 19, 2008 11:53 AM   in reply to Tad Staley
    I'm one of those users.
    Well, how about adding an "Other Languages" button that leads to a variety of choices, from Greek to Russian (and in my case Persian)? The user could then set their preferred language, which would reduce the 65,000 possibilities to a very reasonable number (probably about as many as English) to download. Also, the language of each document should be stored in its file as a meta tag (I'm not a software engineer, so I don't know the exact term), which would settle the question of which font to download when opening it.
     
  • Jul 28, 2008 12:25 PM   in reply to Tad Staley
    quote:

    Like virtually all modern software, Buzzword adheres to the Unicode standard, where characters are defined with 16 bits, resulting in a total of over 65,000 possible characters.

    Actually, Unicode (the standard) does not care about the number of bits.
    It has enough space to encode more than one million characters, and the current version (Unicode 5.1) already encodes more than 100,000 characters (http://www.unicode.org/versions/Unicode5.1.0/).

    quote:

    Buzzword must use something called "embedded fonts".

    Nothing prevents Flash/Flex from using fonts "html style".
    In fact, Buzzword can add a "Generic sans-serif" font as an option (font-family: Verdana, Arial, Helvetica, sans-serif;) with zero effort.
    The document will not look the same on all computers, but this might be better than the current bullets.
    So this is not a "must".
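
On the first point in the reply above (the size of the Unicode code space), a short Python check may help. The constant names and the `utf16_units` helper are introduced here purely for illustration. The code space runs from U+0000 to U+10FFFF, and characters above U+FFFF take two 16-bit units in UTF-16.

```python
import sys

# The Unicode code space runs from U+0000 to U+10FFFF.
MAX_CODE_POINT = sys.maxunicode          # 0x10FFFF on Python 3.3 and later
CODE_SPACE_SIZE = MAX_CODE_POINT + 1     # 1,114,112 code points
BMP_SIZE = 0xFFFF + 1                    # 65,536 values fit in one 16-bit unit


def utf16_units(ch):
    """Return how many 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-le")) // 2
```

The Greek omega fits in one unit, while a character outside the Basic Multilingual Plane, such as the musical G clef at U+1D11E, needs a surrogate pair.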
     
  • Jul 28, 2008 1:06 PM   in reply to Raphael Freeman
    For Raphael: it all goes back to the text rendering engine, which is part of Flash Player, so outside the direct control of Buzzword.

    But the good news is that Flash Player 10 has better text rendering support, including complex scripts and right-to-left text (this was announced at MAX 2007).

    So the improvement will probably trickle down to all applications in the end.
     
  • Jul 30, 2008 12:48 PM   in reply to Tad Staley
    The Unicode BMP covers many languages. It may not be practical for an application to support all of them, even if the underlying text model uses UTF-16 or UCS-4. The application may lack support for line breaking, word boundaries, fonts, or search features in a given language. The application must clearly convey which languages are supported. Some applications disable typed input for a range of characters. MS Office has a good UI for setting the supported "editing_languages" in its Office language settings tool. It has a list of languages that can be edited in the document.
    Buzzword could have something similar (like the "Other Languages" button mentioned above): if the user activates two or three languages, Buzzword can download the resources related to those languages. This way the user is also aware of which languages he/she can edit.

    Using the sans/serif/typewriter font option of FP/AS3 is a good option, so that a font can be picked automatically if it is available on the system. Again, if the system does not have a font, or the font lacks a range of glyphs, the problem comes back.
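
The fallback idea raised in the replies above, and its limit, can be sketched in Python as follows. This is an illustration only: the font names `FontA` and `FontB` and their coverage sets are invented example data, and `pick_font` is a hypothetical helper, not a Flash Player or ActionScript API.

```python
# Invented example data: each font name maps to the code points it can draw.
EXAMPLE_COVERAGE = {
    "FontA": set(range(0x20, 0x100)),                           # Latin-1 only
    "FontB": set(range(0x20, 0x100)) | set(range(0x370, 0x400)),  # plus Greek
}
EXAMPLE_STACK = ["FontA", "FontB", "FontC"]  # FontC is not installed


def pick_font(ch, font_stack, coverage):
    """Return the first font in font_stack that covers ch, or None if none does."""
    cp = ord(ch)
    for name in font_stack:
        # A font missing from the coverage table is treated as not installed.
        if cp in coverage.get(name, ()):
            return name
    return None
```

When `pick_font` returns `None`, no font in the stack has the glyph, which is the "problem comes again" case: the application still has to show some placeholder.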
     
  • Aug 8, 2008 8:35 AM   in reply to Tad Staley
    can we at least have one font with all the glyphs in it? Well not all, but I have a problem rendering Hungarian accents like ő and ű...

    thanks
     
  • Aug 11, 2008 2:28 AM   in reply to Mollaka
    Dear Mollaka,

    At the moment Buzzword is geared only to the English language. Internationalization is planned, but at this time we cannot provide any timelines as to when this feature will be available.

    Regards,
    Lao Ma
     
  • Aug 11, 2008 4:29 AM   in reply to Tad Staley
    i understand and accept...at least, the pdf and .doc previews are showing these chars so it is acceptable until then.
     
  • Oct 18, 2008 4:06 PM   in reply to Tad Staley
    Very interesting writeup
     
  • Oct 31, 2008 3:14 PM   in reply to Tad Staley
    I wonder when I will be able to use it to edit Chinese articles?

    Thanks!
     
  • Nov 26, 2008 11:05 AM   in reply to Tad Staley
    Adverts removed
     
  • Dec 8, 2008 2:52 PM   in reply to Lao Ma
    To support the Hungarian Language, the following diacritical characters are needed:

    á Á, é É, í Í, ó Ó, ö Ö, ő Ő, ú Ú, ü Ü, and ű Ű

    Buzzword already supports á Á, é É, í Í, ó Ó, ö Ö, ú Ú, and ü Ü.

    So, to fully support Hungarian, Buzzword just needs to add support for ő Ő and ű Ű (i.e. the double acute characters).

    I hope this can be accommodated relatively soon, as it's a seemingly small addition and would add yet another supported language to Buzzword's repertoire.

    Many thanks!

    Rodger
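
The split described in the reply above matches the boundary of Latin-1, which can be checked with a few lines of Python. This is a quick sketch, not a statement about Buzzword's actual font files; `HUNGARIAN_ACCENTED` and `outside_latin1` are names made up for the example.

```python
import unicodedata

# Accented letters of the Hungarian alphabet, lower and upper case.
HUNGARIAN_ACCENTED = "áÁéÉíÍóÓöÖőŐúÚüÜűŰ"


def outside_latin1(chars):
    """Return (character, code point, Unicode name) for each char Latin-1 cannot encode."""
    result = []
    for ch in chars:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            result.append((ch, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return result
```

Only the four double acute letters are reported. They sit in the Latin Extended-A block (U+0100 to U+017F), just past the end of Latin-1.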
     
  • Dec 15, 2008 4:20 AM   in reply to Tad Staley
    Great information. thank you.

    advert removed by moderator
     
  • Dec 15, 2008 4:59 PM   in reply to Tad Staley
    Turkish Characters?
     
  • Feb 17, 2009 1:51 PM   in reply to Tad Staley
    I believe a lot of the languages would require only a few additional letters. German, Polish, and Hungarian are just a few examples. Since these would add only a few bytes in total, I don't think they would impact loading times in any meaningful way.
     
  • Mar 12, 2009 8:06 AM   in reply to Tad Staley
    Would it be possible to arrange things so that the Latin-1 section of the font files is downloaded initially (as is done now), and then other glyphs (or sets of glyphs) are downloaded "on demand"?
     
  • Mar 12, 2009 9:24 AM   in reply to Ensjo
    I don't think there is a way to do this today.
     
  • Mar 12, 2009 9:28 AM   in reply to hsuvarna
    quote:

    MS Office has a good UI of setting supported "editing_languages" in their office language settings tool. It has a list of languages that can be edited in the document.

    But setting that option only makes Office add extra support for the selected languages; it does not prevent editing in other languages.
     
  • ereninan
    Oct 25, 2011 1:36 PM   in reply to Tad Staley

    Hi,

    It has been more than 3 years since this thread was started, and as far as I can see international characters are still not supported. I find it hard to believe such a thing would take this long to support. So I would like to update the question: are you still planning to support international characters?

     