17 Replies Latest reply on May 29, 2010 8:01 AM by David_Powers

    Accented characters in CS5

    garfieldsevilla

       

      Can anyone advise whether the treatment of accented characters has improved in DW CS5? We do quite a bit of work in Spanish and French, and content mostly arrives in Word, but when we paste this into DW CS3 we have to manually convert the accented characters to their HTML entity (&aacute;-style) equivalents. We tested CS4 and it was the same, so we decided not to upgrade. Options to “paste as is” or “paste in HTML format” would be great, as would site-wide checks for accented characters not in entity format. No matter how many times we search and replace, we always seem to get a call from someone saying a character is corrupted.

        • 1. Re: Accented characters in CS5
          Donald Booth

          Have you tried the Edit > Paste Special... options?

          • 2. Re: Accented characters in CS5
            garfieldsevilla Level 1

             

            Sorry, we haven't downloaded a CS5 trial yet; it's really big.

             

            In CS3/CS4, the only "Special characters" that can be encoded with & are "< > & and *" via Paste Special/ Paste preferences.

             

            Are you saying there are more Special characters in CS5? In particular, we need often-used characters like ñ or € to be pasted as &ntilde; and &euro;. And of course we need all the accented vowels.

            • 3. Re: Accented characters in CS5
              hans-g. Adobe Community Professional & MVP

              Hi,

               

              here we are for example: http://www.utexas.edu/learn/html/spchar.html, BUT be careful, some of them make your website needlessly complicated, because in the meantime there are grown several international standards.

               

              Hans-G.

              • 4. Re: Accented characters in CS5
                garfieldsevilla Level 1

                This list illustrates the problem nicely. Paste in CS3 simply pastes the literal (non-ASCII) character; we are hoping that CS5 will convert the character into its HTML entity equivalent using &.

                • 5. Re: Accented characters in CS5
                  hans-g. Adobe Community Professional & MVP

                  Hi,

                   

                  hoping, no, let's try it! Look here, I pasted the whole source code (really unmodified) of the list's website in CS5 and loaded it on my server, look here:

                  http://www.goldschmiede-blumberg.de/code.php, anything to object?

                   

                  Hans-G.

                  • 6. Re: Accented characters in CS5
                    garfieldsevilla Level 1

                     

                    Sorry, just to be clear, are you on DW CS5? The page you refer to is already encoded correctly:

                     

                    <th>&atilde; <hr width="50%"  noshade="noshade" />&Atilde;</th>

                     

                    Cut and paste on CS3 also works; there is nothing to modify. What we are looking for is to cut:

                     

                    <th>ã <hr width="50%"  noshade="noshade" />&Atilde;</th>

                     

                    And get DW to paste

                     

                    <th>&atilde; <hr width="50%"  noshade="noshade" />&Atilde;</th>

                     

                    Or to put it another way, type some accented text such as “úñó” into Word and paste it into DW. What you get in CS3 is “úñó” and not the HTML entity form “&uacute;&ntilde;&oacute;”. We have to change this manually in CS3.
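The conversion being asked for here can be sketched outside Dreamweaver. The following is a minimal sketch in Python, not anything Dreamweaver itself provides; the function name `encode_non_ascii` is hypothetical. It relies on the standard library table `html.entities.codepoint2name` and falls back to a numeric character reference when no named entity exists.

```python
from html.entities import codepoint2name


def encode_non_ascii(text):
    """Replace each non-ASCII character with a named HTML entity when one
    exists, otherwise with a numeric character reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # ASCII, including <, > and &, is left untouched so that
            # existing markup and existing entities are not double-encoded.
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append("&#" + str(cp) + ";")
    return "".join(out)


print(encode_non_ascii("úñó"))  # &uacute;&ntilde;&oacute;
```

Because ASCII is passed through unchanged, the sketch is meant for text that is already valid HTML source; it does not escape <, > or &.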

                    • 7. Re: Accented characters in CS5
                      jxlusa Level 2

                      I'm not sure if this will get all the characters you are looking for, but it should get most of them.

                       

                      Dreamweaver CS4 (and I believe CS3, though I can't test that right now) does in fact encode all the special characters in the test sample “úñó”.

                       

                      The problem is probably that you have the Content-Type set to charset=utf-8 instead of charset=iso-8859-1.

                      UTF-8 is not supposed to require &# encoding. In modern browsers, the characters themselves are supposed to just work, so Dreamweaver does not encode them.

                       

                      If you change the preferences for new documents to latin1 Western European, instead of Unicode(utf-8) (or whatever you have it set to right now) it will encode the accented characters.

                       

                      Of course, if you paste directly into Code view, it will not encode the entities; you need to paste into Design view.

                       

                      To change the Content-Type, go to Edit > Preferences > New Document > Default Encoding

                      and change  Default Encoding to "Western European"

                       

                       

                      For any existing html pages, you will have to replace


                      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

                      with
                      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

                       

                      Close and reopen the existing html page and newly pasted accented characters will be encoded. I'm not sure offhand, but I think that if you do a search and replace on the text it will update the code to encoded characters.

                      • 8. Re: Accented characters in CS5
                        hans-g. Adobe Community Professional & MVP

                        Hi,

                         

                        I couldn't resist, and I copied and pasted. The (CS5!) result (without any gimmicks) you will find here:

                        http://www.goldschmiede-blumberg.de/code01.php, I am sure now that's what you want.

                         

                        Hans-G.

                         

                        By the way, am I right that you can use neither DW CS5 nor a DW CS5 trial version?

                        • 9. Re: Accented characters in CS5
                          hans-g. Adobe Community Professional & MVP

                          Hi,

                           

                          in addition and to explain my note No. 3: "... because in the meantime there are grown several international standards." here the corresponding thread, in which I asked something like you did:

                           

                          http://forums.adobe.com/message/2726645#2726645

                          David_Powers said there among other things: "The dollar sign ($) is accepted in all encodings. It's part of the original ASCII set of 128 characters, which are now used as the first 128 characters of Unicode. Unlike the symbols for the pound sterling and euro, it does not require conversion as an HTML or numeric entity."

                           

                          Hans-G.

                          • 10. Re: Accented characters in CS5
                            garfieldsevilla Level 1

                            Changing the charset to <meta http-equiv="Content-Type" content="text/html;  charset=iso-8859-1" /> does indeed result in accented characters being converted to HTML & codes (or whatever they are called) when pasted in CS3/4 and probably in CS5.

                             

                            It's a pity that you can't force DW to do this regardless of the encoding. A Spanish sentence with accents encoded as entities (&aacute; and so on) ALWAYS displays correctly, regardless of the encoding of the browser.

                             

                            If you create a simple page with "úñó and &uacute;&ntilde;&oacute;" and then change the char code in Firefox, the second word displays correctly in UTF-8, iso-8859-1, EUC-JP or whatever you choose.
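The Firefox experiment described above can be approximated with a short sketch in Python. It assumes the page is saved as UTF-8 and the browser is then forced to read it as iso-8859-1; the variable names are illustrative only. The literal characters are stored as multi-byte sequences that another encoding misreads, while the entity form is plain ASCII and has the same bytes in all three encodings named above.

```python
literal = "úñó"
entities = "&uacute;&ntilde;&oacute;"

# The page is saved as UTF-8, but the browser is forced to iso-8859-1.
utf8_bytes = literal.encode("utf-8")
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # ÃºÃ±Ã³

# The entity form contains only ASCII characters, so its bytes are
# identical in UTF-8, iso-8859-1 and EUC-JP.
same_bytes = (
    entities.encode("utf-8")
    == entities.encode("iso-8859-1")
    == entities.encode("euc_jp")
)
print(same_bytes)  # True
```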

                             

                            When working with European languages, the only way to be sure your text will display correctly is using the & code. So as CS5 also does not help much with this, the workaround is to paste your accented Spanish/French/Portuguese text in Design or Code view, save the page, modify the page encoding to iso-8859-1 so that DW changes all the characters, then switch back to UTF-8.

                             

                            Does anyone know how to change the encoding of ALL the pages in a site in one operation? If you could toggle all the page encodings, this would be an OK workaround, and better than search and replace for all the accented characters.

                             

                            Just realised there is a problem: if you have a template where you define the charset for all pages, you cannot simply toggle the page charset to get DW to convert accented characters in your page.

                             

                            Message was edited by: garfieldsevilla

                            • 11. Re: Accented characters in CS5
                              jxlusa Level 2

                              Yes, it's clumsy. The main problem, as I see it, is that the standards in this regard are so convoluted, and browser support is both spotty and lags behind by its very nature. I'm not really sure what the best way to handle this sort of problem is.

                               

                              Certainly you can make a feature request for future versions, or possibly even for an update of the current version. It seems like one of the easier features to add would be to put a choice in the Preferences box where you could add custom characters to the "Always Encode Special Characters" function. That wouldn't really help you for a couple of months at best, and never at worst, but it's worth asking for for the future.

                               

                              For the DW template-based site, I'm not seeing that problem. It seems like it would actually be easier to simply change the encoding once in the template, have it update all the pages, then do a text search and replace for single accented characters on the entire site. You could then just change the encoding back to utf-8 globally once it's done. You don't even have to open any individual pages at all. I'm working with DW CS4.

                               

                              Any way you look at it, it's a pain. The right thing would be for all browsers to simply understand that if UTF-8 is specified in the doc, they should read all the UTF-8 characters correctly, encoded or not. That is what standards are for. What a nice dream I'm having.

                              • 12. Re: Accented characters in CS5
                                garfieldsevilla Level 1

                                An "Always Encode Special Characters" option would be nice. Actually, I would name it "Always Encode non-ASCII characters", as ñ is part of our culture and nobody thinks of it as a special character. For Italians, “J” and “K” are special characters; their standard alphabet has 21 letters.

                                 

                                Ñ and other accented characters do appear in UTF-8. The problem is if the user has their browser set to force the character encoding. Say a user has just visited a Greek or Japanese site which did not use the HTML & codes, so they set the browser encoding. When they come back to Spain, the accented characters do not display correctly UNLESS they are encoded with HTML &.

                                 

                                Our stance internally is that ALL non-ASCII characters should be encoded with HTML & so that we can be 100% sure that the client's message will display on any visitor's browser all the time.

                                • 13. Re: Accented characters in CS5
                                  jxlusa Level 2

                                  There is already an "Always Encode Special Characters" option, as you mentioned in an earlier (first?) post. But the characters considered Special are limited to those that are part of HTML, that is <, >, and &. I might be forgetting one or two, but no more than that.

                                   

                                  Which is why I think it would be relatively easy for Adobe to allow users to simply add to that list as needed for their specific purposes.  You can't really just say ALL special characters should always be encoded without defining what they all are. In UTF-8, accented characters for instance are not special; they are simply part of the character set. It's the browser implementation that's not up to par.

                                  • 14. Re: Accented characters in CS5
                                    garfieldsevilla Level 1

                                    jxlusa wrote:

                                     

                                    You can't really just say ALL special characters should always be encoded without defining what they all are.

                                    Why not? DW already knows that certain special characters need to be encoded with &. We just need an option to force conversion on the present page, selected pages, or the whole site. Perhaps the COMMAND menu would be a good place, as this is in effect a kind of formatting.
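The "whole site" conversion requested above could also be run as a batch job outside the editor. This is a rough sketch under several assumptions: the pages are `.html` files saved as UTF-8 under one folder, the helper names `to_entities` and `convert_site` are hypothetical, and converting every non-ASCII character is acceptable (inline scripts containing accented characters would need separate handling). It is not a Dreamweaver command.

```python
import re
from html.entities import codepoint2name
from pathlib import Path

NON_ASCII = re.compile(r"[^\x00-\x7f]")


def _entity(match):
    # Prefer a named entity; fall back to a numeric character reference.
    cp = ord(match.group())
    name = codepoint2name.get(cp)
    return f"&{name};" if name else f"&#{cp};"


def to_entities(text):
    """Replace every non-ASCII character in text with an HTML entity."""
    return NON_ASCII.sub(_entity, text)


def convert_site(root, encoding="utf-8"):
    """Rewrite every .html file under root in place and return the list
    of files that actually changed."""
    changed = []
    for path in sorted(Path(root).rglob("*.html")):
        original = path.read_text(encoding=encoding)
        converted = to_entities(original)
        if converted != original:
            path.write_text(converted, encoding=encoding)
            changed.append(path)
    return changed
```

Running `convert_site` on a copy of the site first is advisable, since the files are overwritten in place. A second run returns an empty list because nothing non-ASCII remains.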

                                    • 15. Re: Accented characters in CS5
                                      David_Powers Adobe Community Professional

                                      Using HTML entities, such as &ntilde;, is not necessary if the page is encoded using UTF-8. If pages encoded as UTF-8 fail to display correctly in a browser, the fault lies in the way the web server has been configured. It sounds as though the web server is sending an HTTP header for iso-8859-1 encoding.

                                       

                                      However, if you want to configure Dreamweaver to convert accented characters to HTML entities, the simple answer is to set Western European as the default encoding in Preferences. Any accented characters pasted into Design view (but not Code view) will be converted automatically. Unfortunately, changing the encoding of an existing page won't make the conversion.

                                       

                                      Because UTF-8 is being adopted as the universal encoding, I think it's highly unlikely that Dreamweaver will add the functionality to convert accented characters to HTML entities, as they are rapidly becoming unnecessary.
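The server-side mismatch described in this reply can be checked by comparing the charset in the HTTP Content-Type header with the one in the page's meta tag. The sketch below is illustrative only: the function names are hypothetical, the header and meta values are passed in as strings rather than fetched from a server, and parsing is delegated to the standard library's `email.message.Message`.

```python
from email.message import Message


def charset_of(content_type):
    """Return the lower-cased charset parameter of a Content-Type value,
    or None when no charset is declared."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def header_conflicts_with_meta(header_value, meta_content):
    """True when both values declare a charset and the two differ."""
    header_cs = charset_of(header_value)
    meta_cs = charset_of(meta_content)
    return header_cs is not None and meta_cs is not None and header_cs != meta_cs


# The server announces iso-8859-1 while the page's meta tag says utf-8.
print(header_conflicts_with_meta(
    "text/html; charset=ISO-8859-1",
    "text/html; charset=utf-8",
))  # True
```

Browsers give the HTTP header precedence over the meta tag, which is why a conflicting header makes a correctly saved UTF-8 page display wrongly.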

                                      • 16. Re: Accented characters in CS5
                                        garfieldsevilla Level 1

                                        Hi David, the problem is this:

                                        Ñ and other accented characters do appear in UTF-8. The problem is if the user has their browser set to force the character encoding. Say a user has just visited a Greek or Japanese site which did not use the HTML & codes, so they set the browser encoding. When they come back to Spain, the accented characters do not display correctly UNLESS they are encoded with HTML &.

                                         

                                        So the client calls us to say the page is corrupted, only it isn't, but they are not really interested in the technical discussion. The only thing they remember is that our page didn't work properly.

                                         

                                        Ultimately you are right, the problem will go away, but not anytime soon. As long as other countries use local encodings in their web pages, those pages won't show up correctly in, say, Firefox, so the viewer will set the local encoding and may forget to reset autodetect. We don't really want to use iso-8859-1 just to fix the encoding. We can continue to search and replace as we do in CS3.

                                         

                                        I guess the original question is answered and there is no need to trial CS5. We are going to check out MS Expressions to see if there is better international support.

                                        • 17. Re: Accented characters in CS5
                                          David_Powers Adobe Community Professional

                                          garfieldsevilla wrote:

                                           

                                          We are going to check out MS Expressions to see if there is better international support.

                                          I have just tested Microsoft Expression Web 3 by copying a paragraph from a French website into Design view. Even if you set the page encoding to iso-8859-1, it uses accents in the underlying code. What's more, it wraps the text in <font> tags!!

                                           

                                          Dreamweaver, on the other hand, does exactly what you seem to want. If you set the page encoding to Western European in Preferences, all accents are automatically converted to HTML entities.

                                           

                                          This is what I pasted into Dreamweaver Design view:

                                          Les vins de Provence ont une origine très ancienne. Les grecs, fondateurs de la bonne ville de marseille, y plantèrent les premiers ceps. César en parle dans ses mémoires lors de la conquète de la Gaule.

                                          This is what Dreamweaver automatically created in Code view:

                                          Les vins de <strong>Provence</strong> ont une origine tr&egrave;s ancienne. Les grecs, fondateurs de la bonne  ville de marseille, y plant&egrave;rent les premiers ceps.  C&eacute;sar en parle dans ses m&eacute;moires lors de la conqu&egrave;te de la Gaule.

                                          What neither Dreamweaver nor Expression Web can do is automatically convert accented characters once they have been inserted into Code view. However, if you set your default encoding to Western European, all accents in new pages will be converted exactly the way you want. Where you're going wrong is in switching back to UTF-8. If you want to use HTML entities, the correct encoding is iso-8859-1. If you want to use UTF-8, you don't need the entities.