7 Replies Latest reply on Apr 13, 2011 7:43 AM by Steve4Flex

    Solution for encoded UTF-8 special characters

    PeakDigital Level 1

      I am building an extensive application system and am trying to keep everything clean, standardized and secured. Part of that effort has entailed encoding all non-standard or special characters into their Unicode notation before transmitting to the database. I chose this route because I didn't wish to utilize any HTML standards in this process, so using "&lt;" was not the route I wanted.

       

      Anyway... encoding the special characters prior to transmission to the database went well. However, I just spent about 3 hours trying to figure out how to go the other direction, to turn the "U+003C" back into a <.

       

      I may have overlooked something obvious in the documentation, but I kept trying to use the String.fromCharCode method and it just didn't work - all sorts of odd problems. I finally stumbled across the notation of \u003C in a Moock book and tried it in the fromCharCode, but that didn't work either.

       

      Finally, I realized it didn't need to be treated as a number and just dumped it into a string variable like this:

       

      var charLessThan:String="\u003C";
      

       

      After defining the characters into string variables, then I can process the whole list of special characters with lines like this:

       

      stringTranslated = stringTranslated.replace(utfSymbolLessThan,charLessThan);
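
As a possible generalization of the per-character approach above, the "U+XXXX" tokens could be decoded in a single pass. This is only a sketch under assumptions: the function name decodeUnicodeNotation is hypothetical, and it assumes every token is written as "U+" followed by exactly four hex digits. String.fromCharCode expects numeric character codes, so the hex digits are parsed into a number first. Note also that String.replace() with a String pattern replaces only the first match, so the sketch uses a RegExp with the g flag.

```actionscript
// Hypothetical helper (not from the original post): converts every "U+XXXX"
// token in a string back into the character it names.
// Assumes exactly four hex digits per token.
function decodeUnicodeNotation(input:String):String {
    // The g flag makes replace() process every match, not only the first.
    var pattern:RegExp = /U\+([0-9A-Fa-f]{4})/g;
    return input.replace(pattern,
        function(match:String, hex:String, index:int, whole:String):String {
            // hex holds the captured digits, e.g. "003C"; parse them as base 16.
            return String.fromCharCode(parseInt(hex, 16));
        });
}

var decoded:String = decodeUnicodeNotation("a U+003C b U+0026 c");
trace(decoded); // a < b & c
```

With this shape there is no need to define one string variable per special character, at the cost of decoding any text that happens to look like a "U+XXXX" token.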
      

       

      Hopefully this can help some other bleary-eyed programmer out there.  And if anyone has suggestions to improve on this I'd like to see them.

       

      Thanks.

      Paul

        • 1. Re: Solution for encoded UTF-8 special characters
          GordonSmith Level 4

          Why aren't you just storing UTF-8 in your database? Are you using a database that can't store Unicode?

           

          Gordon Smith

          Adobe Flex SDK Team

          • 2. Re: Solution for encoded UTF-8 special characters
            PeakDigital Level 1

            Gordon,

             

            I can't really tell what is happening. My database is MySQL 5, and each table is set for UTF-8 encoding. It may be a problem with my database interface (Navicat) rather than the database itself. Whenever I run a query on it, any special characters show up in the console as "?" instead. In fact, if I paste a special character into the table grid view, it shows properly until I commit the record change, then turns it to "?".

             

            I just tried again, pasting the down arrow character (U+2193) into a field that is used as the labelField for a ComboBox control. It shows up as "?" in my Flex app.

             

            And, I can't try inserting it via Flex since I can't get them to show up in my Flex app despite specifically including the proper character range in my CSS.

             

            Thanks for your time.

             

            Paul

            • 3. Re: Solution for encoded UTF-8 special characters
              GordonSmith Level 4

              If your Flex app can't display Unicode characters properly, it is probably because you're using a font that doesn't have the characters you need. Not all fonts have all Unicode characters. You should find a font-inspection tool to check what characters the font supports.

               

              Gordon Smith

              Adobe Flex SDK Team

              • 4. Re: Solution for encoded UTF-8 special characters
                tooMuchTrouble Level 3

                If you're seeing question marks, then your data is being garbled by bad encoding. Either your db's encoding is not set to UTF-8, or your db driver can't handle Unicode or is improperly set up.

                • 5. Re: Solution for encoded UTF-8 special characters
                  Muzak Level 3

                  Or the middleware (PHP, ASP, or whatever it is you're using) is messing it up.

                  • 6. Re: Solution for encoded UTF-8 special characters
                    PeakDigital Level 1

                    First, a clarification to my reply to Gordon Smith - the confusion over characters showing up as ? came later.  My original purpose in converting certain characters to UTF codes was to avoid problems with XML from characters like < and &.

                     

                    I got a little confused in my reply since I had 2 threads on the board, both related to UTF encoding. The other one is at http://forums.adobe.com/thread/457736

                     

                    ===========

                     

                    The existence of Unicode characters in my font is still a point of confusion. I have pasted a few of the Unicode characters in the U+2100 range into text editors and specified to display all text in Verdana font, and the characters show up. Then, after Gordon's recommendation, I opened Mac OS's Character Palette and looked at that Unicode character range, and it does not show Verdana as having those characters available. I wonder if the text editors are doing some kind of background font replacement?

                     

                    Next step was to embed a different font into my Flex app. I chose Arial since the Character Palette said it did have the Unicode range I was interested in, and I was able to paste the arrow characters into my Flex app. So, I guess that is the final verdict - the Verdana font doesn't have the glyphs I need even though they show up in text editors in Verdana.

                     

                    PaulH and Muzak, you also caught a contributing factor to the problem. Even though my MySQL tables are all set to UTF-8 encoding, the varchar fields within them were set to latin1. Not sure how that occurred, but I'll have to review my entire DB now and make sure there are no more latin1 settings hiding in there.  I don't know whether to blame this on Navicat, the MySQL server or if I have to accept it as a PEBKAC issue.

                     

                    Thanks all for your time and input.

                    Paul

                    • 7. Re: Solution for encoded UTF-8 special characters
                      Steve4Flex

                      I used another font that supports Unicode, such as Lucida Sans Unicode, but some symbols like (™) are still displayed in Flex as '?'.