32 Replies Latest reply on Dec 18, 2008 2:54 PM by Peter Grainge

    Unicode character set

    HockeyTime Level 1
      We are using RoboHelp 7 (WebHelp) to create and generate our help system. We have been using RoboHelp for a very long time but this is the first time we've generated help with 7.

      There is a problem with the display of certain characters, dashes, apostrophes, and so forth when the help is accessed through our product from a host server. (They are okay when displayed on my machine...of course.)

      We have never had this issue before. After some research, I found that RoboHelp 7 now supports Unicode (UTF-8). This seems to be at the root of our problem. One of the discussion threads gave instructions on changing the browser display options. That workaround would have to be described in our documentation, and I'm afraid our users might not catch this little nuance. Is there another solution? Can I reset what is being generated somehow through RoboHelp? Through a setting in the generated HTML? Any help would be appreciated.
        • 1. Re: Unicode character set
          Peter Grainge Adobe Community Professional (Moderator)
          Not sure.

          Take a look at item 22 with your developers. Could that be relevant?

          http://www.grainge.org/pages/authoring/rh7/using_rh7.htm

          See if that fixes it for you before we look at the issues re delivering to customers.

          • 2. Re: Unicode character set
            HockeyTime Level 1
            Thanks. I will work with our html guy and see if this information helps. However, I did generate the help with the "Apply Mark of the Web" selected. I'll let you know.
            • 3. Re: Unicode character set
              HockeyTime Level 1
              I looked into the mark of the web thing. Based on what I read, for the mark of the web to work there is supposed to be an entry immediately after <!doctype consisting of <!-- saved from url=(0014)about:internet -->

              This entry is in my generated help. So in theory, the issue should be fixed. But it isn't.


              The forum entry also says that using the Mark of the Web setting adds a BOM, which is basically what fixed the problem for the writer using French. However, I found something elsewhere that said, "However, in Unix-like systems (which make heavy use of text files for file formats as well as for inter-process communication) this practice is not recommended," and so on. I think that this may be the issue. But I still don't know the answer to the problem.
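For anyone who wants to check whether a generated topic actually starts with a BOM, here is a minimal Python sketch. The function names and the sample bytes are made up for illustration; this is not something RoboHelp provides.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical topic content, with and without a BOM.
with_bom = codecs.BOM_UTF8 + b"<!doctype html>"
without_bom = b"<!doctype html>"

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
```

To test a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `has_utf8_bom`.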
              • 4. Re: Unicode character set
                Peter Grainge Adobe Community Professional (Moderator)
                When the help is being viewed from the product, are you looking at it on your machine or another one?

                • 5. Re: Unicode character set
                  HockeyTime Level 1
                  I am looking at the product from my machine. The only place I'm looking at the files is from my machine. If I view them locally or from our network the characters are fine. It's only when I look at them from the hosted server that they are altered.
                  • 6. Re: Unicode character set
                    Peter Grainge Adobe Community Professional (Moderator)
                    OK. The hope was that when viewed from the hosted server, they were being viewed on a different PC without the required fonts being installed.

                    It really does look as if the cause is what had been identified but regretfully I don't have any further suggestions.

                    I'll make some enquiries and post back if I get anything. It's not likely to be quick though.

                    • 7. Re: Unicode character set
                      HockeyTime Level 1
                      Believe me, quick or slow, any help will be appreciated! I'm not sure what we're going to do in the short term. I put a call in to the Adobe help desk but that wasn't a really big help. I may try again.

                      Thanks again for your help.
                      • 8. Re: Unicode character set
                        Peter Grainge Adobe Community Professional (Moderator)
                        HockeyTime

                        Can you let me have some information about the server: Windows or Unix, operating system, etc.?

                        Also how do you publish to the server?
                        • 9. Re: Unicode character set
                          HockeyTime Level 1
                          Our server is a Sun T2000 server. It runs on Solaris which is the Sun version of Unix.

                          We generate the help and then copy the files to the server. The help is displayed through a "shell" which connects the help topic to the application via a help ID number.
                          • 10. Re: Unicode character set
                            Peter Grainge Adobe Community Professional (Moderator)
                            It could be the shell not handling the characters properly. Can you let us have more information about that?

                            • 11. Re: Unicode character set
                              HockeyTime Level 1
                              I got an answer back from RoboHelp Support (oxymoron). Here is what they said.

                              "I understand that you are still having an issue viewing a Webhelp output
                              on a Unix server.

                              This is a known issue, please follow the suggested work around below:
                              1. Open each topic in HTML view.
                              2. Choose Edit> Replace, and replace the wrong characters with the
                              correct HTML code.
                              3. You can also right-click on the page you are editing and choose
                              Replace."

                              Thank you for the time you took to look at this. Obviously I've hit a bug and the only solution is manually changing the code.
                              • 12. Re: Unicode character set
                                Peter Grainge Adobe Community Professional (Moderator)
                                Contact me via my site please.

                                • 13. Re: Unicode character set
                                  Robo_Steve
                                  The browser is misinterpreting the page. I think you can add the following directive to tell the browser which charset you are sending.


                                  <html>
                                  <head>
                                  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
                                  • 14. Re: Unicode character set
                                    HockeyTime Level 1
                                    Thanks but that is already part of the generated help topic. I appreciate the thought!
                                    • 15. Re: Unicode character set
                                      HockeyTime Level 1
                                      Here's the answer as I know it.

                                      1. Our help system is a legacy system and was originally created in RoboHelp for Word.
                                      2. We converted it to HTML with X5. All looked good.
                                      3. With the advent of Robo7, apostrophes, em dashes, and en dashes displayed "oddly".
                                      4. Only the "old" characters displayed oddly. Any new apostrophes and so forth were fine.
                                      5. RoboHelp for Word had a smart quotes feature. I think this might have been at the root.
                                      6. Once we did a search and replace (3rd party tool) for the old odd characters, everything was good to go. New characters displayed correctly, old characters displayed correctly.

                                      I think there's an issue with the conversion from X5 to Robo7 but at least I can now "fix" it and go forward.
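The search and replace described above could be scripted along these lines. This is only a sketch under assumptions: it assumes the old topics contain smart punctuation and swaps every non-ASCII character for a numeric character reference, which may not be exactly what the third-party tool did. The function name is hypothetical.

```python
def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML numeric character reference."""
    # 'xmlcharrefreplace' turns a character such as U+2019 into '&#8217;'.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Hypothetical topic text with a smart apostrophe and an en dash.
sample = "It\u2019s on pages 5\u201310"
print(to_numeric_entities(sample))
```

Because the result is plain ASCII, it should display the same way whichever character set the server or browser assumes.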
                                      • 16. Re: Unicode character set
                                        uknich
                                        The bullets and en dashes in my procedural help topics are displaying as foreign diacriticals next to a cent sign. It looks like both bullets and en dashes are changed to the same set of wrong characters. I think this is the same or similar problem to this earlier post, but the solution in HockeyTime's 08/08/08 post isn't working for me.

                                        The project is new in RoboHelp 7 WebHelp (not imported from RH6) and I use Dreamweaver MX as an editor. I wrote the topics with &#8226; for bullets -- which was the code I used successfully in RH6. After importing the .htm topic created in DW into the RH7 project, RH changed the bullet HTML code to a bullet graphic -- even in the HTML view. While the topics look good on my local Windows XP, the version accessed on my XP from the Apache server shows wrong characters.

                                        I used the RoboHelp multi-topic search tool to change the bullet graphic to the Alt+0149 bullet character that I was told is the unicode bullet value. The project displays on my Windows XP with correct characters, but after it's deployed to the Apache server, the bullet and en dash characters display as foreign characters.
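The symptom described here is consistent with UTF-8 bytes being read as Windows-1252. The Python sketch below reproduces that mismatch for a bullet and an en dash. It illustrates the likely mechanism and is not a confirmed diagnosis of this server; the function name is made up.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252", errors="replace")


bullet = "\u2022"   # bullet character
en_dash = "\u2013"  # en dash

# Each character becomes three bytes in UTF-8, so each is shown as three
# wrong characters when the bytes are interpreted one at a time.
print(ascii(misread_as_cp1252(bullet)))
print(ascii(misread_as_cp1252(en_dash)))
```

Both results start with an a-circumflex followed by a euro sign, and the bullet ends in a cent sign, which would explain why bullets and en dashes look like nearly the same run of wrong characters.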

                                        The network guy says the Apache server where the Help is deployed has the htpp compound set to support both utf-8 and iso-8859-1. And he says that when he attempts to open the Help with Firefox 3.03, he gets a blank page. (That's definitely a separate problem and for consideration after I get this wrong character stuff fixed.)

                                        I notice in Dreamweaver the meta tag is initially content="text/html; charset=iso-8859-1" but generating in RoboHelp 7 changes that to content="text/html; charset=utf-8". RoboHelp appears to be in total control.

                                        What do I need to do to make the RoboHelp 7 project display properly on the Apache server where Web content is delivered to our clients?

                                        thanks for your help!!
                                        • 17. Re: Unicode character set
                                          Peter Grainge Adobe Community Professional (Moderator)
                                          Uknich

                                          I had a problem where the page was not blank, it showed what are known as BOM characters. I expect your network guy will understand.

                                          The guy who hosts my site sent me this.

                                          I would therefore conclude that the solution to this problem (on Linux systems running Apache) is to add the AddDefaultCharset utf-8 directive to either the Apache config or the site .htaccess file. (The advantage of the latter is that it only affects individual sites.) The default Apache character set is taken from the locale file on Linux and defaults to iso-8859-1. It is the conflict between the Apache header with iso-8859-1 and the page character set of utf-8 that obviously causes Firefox a problem.
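To see whether a page has the conflict described above, you can compare the charset in the HTTP Content-Type header with the one in the page's meta tag. The Python sketch below does that for two strings. The function names and sample values are assumptions for illustration, and the regular expressions are deliberately simple rather than a full HTML parser.

```python
import re
from typing import Optional


def charset_from_header(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r"charset\s*=\s*\"?([\w.:-]+)", content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html: str) -> Optional[str]:
    """Extract the charset declared in the first matching meta tag."""
    match = re.search(r"<meta[^>]*charset\s*=\s*[\"']?([\w.:-]+)", html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charsets_conflict(content_type: str, html: str) -> bool:
    """Return True when both charsets are declared and they differ."""
    header = charset_from_header(content_type)
    meta = charset_from_meta(html)
    return header is not None and meta is not None and header != meta


# Hypothetical values: a server defaulting to ISO-8859-1, a page declaring UTF-8.
header_value = "text/html; charset=ISO-8859-1"
page = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head></html>'
print(charsets_conflict(header_value, page))
```

Browsers generally give the HTTP header priority over the meta tag, so a mismatch like this means the meta declaration in the generated topic is ignored.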

                                          Hope it helps.

                                          • 18. Re: Unicode character set
                                            uknich Level 1
                                            Thank you, Peter, for the response and suggestion! The network guy did investigate the .htaccess file for me today and says we don't use that file. His team lead then called to say the corporation's Apache servers are a standard configuration and default to ISO 8859-1 character sets. The team lead says they will not change the configuration and are not prepared to support at the tool level.

                                            So, that said.... Is there any way to change the RH7 setting to generate WebHelp deliverables in ISO 8859-1 character set instead of UTF-8?

                                            And where can I find the Unicode values to use for bullets and en dashes? I need to verify that what I was told is correct. The search and replace I did in the RH7 topics was to change bullets to Alt+0149 and en dashes to Alt+0150.
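One way to verify those values: on a Western-locale Windows system, Alt+0149 and Alt+0150 enter the characters at positions 149 and 150 of the Windows-1252 code page, which are not the same numbers as their Unicode code points. The Python sketch below looks up the Unicode value and the matching HTML numeric entity. The function name is hypothetical.

```python
def describe_alt_code(alt_code: int):
    """Map a Windows-1252 Alt code to its Unicode code point and HTML entity."""
    char = bytes([alt_code]).decode("cp1252")
    code_point = ord(char)
    return (f"U+{code_point:04X}", f"&#{code_point};")


for code in (149, 150):
    print(code, describe_alt_code(code))
```

So 149 and 150 are Windows-1252 positions, while the Unicode values are U+2022 (bullet) and U+2013 (en dash), matching the &#8226; entity used earlier in the thread.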

                                            This problem is spreading and another Web application Help system that I converted from RH6 to RH7 is displaying wrong characters for colons, apostrophes, em dashes as well as bullets and en dashes.

                                            Your help is much appreciated!
                                            Thanks,
                                            karen
                                            • 19. Re: Unicode character set
                                              Peter Grainge Adobe Community Professional (Moderator)
                                              RH does not offer any other encoding. If the team lead will not assist you, then it seems your line manager needs to escalate the issue, as I don't know of any other solution. Bear in mind, I cannot guarantee that what I have said is the solution for you, merely that it solved the problem for me.

                                              http://www.w3schools.com is a good place to look for the entities.
                                              • 20. Re: Unicode character set
                                                Amebr-ke0mH4
                                                Here's the Unicode page with handy charts:
                                                http://unicode.org/charts/

                                                Although it might be complicated by windows-1252:
                                                http://en.wikipedia.org/wiki/Alt_codes
                                                http://code.knopok.net/alt-codes.html
                                                I think this says Alt+0150 is windows-1252/ANSI rather than UTF-8, but I start getting a bit fuzzy on the details about here.

                                                These html character entities might also help:
                                                http://www.w3.org/TR/REC-html40/sgml/entities.html

                                                A discussion on wikipedia about UTF-8:
                                                http://en.wikipedia.org/wiki/UTF-8

                                                Hope this helps
                                                • 21. Re: Unicode character set
                                                  ksnichols
                                                  I and my manager have escalated and the network people are unwilling to use the fix (even though it does work) because it may compromise security. I also talked with RoboHelp Tech Support and they claim no one else has encountered a problem with character sets. The best solution for me would be a more flexible RoboHelp.

                                                  So, I now need to take RH7 off my department's computers and put RH6 back on. I remember reading that they cannot run parallel. Is that still true for v7.02? What other precautions (besides project backups) should I take?

                                                  Just FYI, the network folks recommended that we switch to a tool other than RoboHelp -- which I've used since 1993.

                                                  thanks for all your suggestions -- including any for downgrading.
                                                  • 22. Re: Unicode character set
                                                    HockeyTime Level 1
                                                    I finally decided that this was a bug and nothing but a bug. The functionality that worked before no longer works, which in my mind means a bug. So I went to http://www.adobe.com/cfusion/mmform/index.cfm?name=wishform&product=38 to log a bug. I got a response from the technical support person there and am very hopeful that he will be able to help me. (But I've been hopeful before). I had previously tried going through Customer Service but they just kept asking for sample projects and telling me this was a known problem.

                                                    I feel your pain. I've tried all the suggestions presented here and just keep hitting walls. Unfortunately, we're in the middle of a huge project and can't go back to the previous version. Good luck to you.
                                                    • 23. Re: Unicode character set
                                                      Peter Grainge Adobe Community Professional (Moderator)
                                                      ksnichols - are you uknich in different clothes?

                                                      Nobody can go back as the code changed in 7. The only way back is the backup you took before upgrading and making all the subsequent changes again.

                                                      If you try opening the project using the HHP, you may think I am wrong as you can edit topics. Wait until you save and close the project.

                                                      This has been logged but you should log it as Hockey Time has. The more people who report it, the more likely a fix.

                                                      Sounds like more escalation is needed given that you acknowledge the fix does work. Maybe someone higher up the network chain will take the view the security risk is low. I am not saying it is, I don't know. What I am getting at is whether or not IT are just saying any change is a compromise without really knowing. Surely if someone can get far enough into your network to view the help, there are bigger issues? Just tossing out some discussion ideas to help you.

                                                      • 24. Re: Unicode character set
                                                        uknich Level 1
                                                        Yes, uknich and ksnichols are the same person -- same job, different corporate owners over the years...

                                                        I get that I can't take an RH7 project back to RH6. I was trying to ask if there was anything to watch for when I uninstall RH7 and reinstall RH6? I do have the RH6 projects in my backup archive. And I have already rebuilt a project started in RH7 in RH6.

                                                        Can I run RH6 and RH7 on the same computer? Has that changed? Is there any gotcha with uninstalling and reinstalling?

                                                        And I hear you about escalating further -- but my last hope for support at the executive level did not step up this week. Our latest owners don't exactly embrace the user assistance model for client-facing software. I feel like I'm doing the same marketing to sell the concept that I did over 10 years ago. :-)
                                                        • 25. Re: Unicode character set
                                                          Level 7
                                                          On Wed, 29 Oct 2008 21:10:00 +0000 (UTC), "Peter Grainge"
                                                          <webforumsuser@macromedia.com> wrote:

                                                          >Uknich
                                                          >
                                                          > I had a problem where the page was not blank, it showed what are known as BOM
                                                          >characters. I expect your network guy will understand.
                                                          >
                                                          > The guy who hosts my site sent me this.
                                                          >
                                                          > I would therefore conclude that the solution to this problem (on Linux systems
                                                          >running Apache) is to add the AddDefaultCharset utf-8 directive to either the
                                                          >Apache config or the site .htaccess file. The advantage of the latter is that
                                                          >it only affects individual sites). The default Apache character set is taken
                                                          >from the locale file on Linux and defaults to iso-8859-1. It is the conflict
                                                          >between the Apache header with iso-8859-1 and the page character set of utf-8
                                                          >that obviously causes Firefox a problem.
                                                          >
                                                          > Hope it helps.
                                                          >
                                                          >

                                                          Hi Peter!

                                                          If the topic declares this (X3):

                                                          <meta http-equiv="content-type" content="text/html;
                                                          charset=windows-1252">

                                                          Why would the browser change it to UTF-8? Changing the browser to use
                                                          1252 fixes everything.
                                                          • 26. Re: Unicode character set
                                                            Peter Grainge Adobe Community Professional (Moderator)
                                                            David

                                                            This thread is about RH7 issues and RH7's metatag will define unicode. You seem to be using X3 which would have defined 1252 in the metatag. I am not clear why you are raising an X3 problem in a RH7 thread.

                                                            • 27. Re: Unicode character set
                                                              HockeyTime Level 1
                                                              Hi David,

                                                              Changing the browser to 1252 fixes only that specific browser session. Peter is right though, this is only a RH7 problem.

                                                              However, RoboHelp does have help available in the form of an Encoder tool that allows you to save the generated HTML files in 1252 encoding AFTER they are generated. Anyone interested might want to contact the technical support people through http://www.adobe.com/cfusion/mmform/index.cfm?name=wishform&product=38.
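The Encoder tool itself is not shown in this thread, so the following is only a guess at the general idea, written as a Python sketch: decode the generated UTF-8 bytes, switch the declared charset, and write the text back out as Windows-1252. The function name and sample content are hypothetical, and this is not Adobe's tool.

```python
import re


def reencode_to_cp1252(utf8_bytes: bytes) -> bytes:
    """Convert UTF-8 HTML bytes to Windows-1252 and update the declared charset."""
    # 'utf-8-sig' also drops a leading BOM if the file has one.
    text = utf8_bytes.decode("utf-8-sig")
    text = re.sub(r"(charset\s*=\s*)utf-8", r"\g<1>windows-1252", text, flags=re.IGNORECASE)
    # Characters with no Windows-1252 equivalent become numeric entities.
    return text.encode("cp1252", errors="xmlcharrefreplace")


# Hypothetical generated topic containing a bullet, an en dash, and a CJK character.
sample = '<meta content="text/html; charset=utf-8"><p>\u2022 a \u2013 b \u4e2d</p>'
print(reencode_to_cp1252(sample.encode("utf-8")))
```

To process real output, you would read each generated file in binary mode, pass the bytes through the function, and write the result back, ideally on a copy of the output folder.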

                                                              • 28. Re: Unicode character set
                                                                Level 7
                                                                On Thu, 18 Dec 2008 07:06:59 +0000 (UTC), "Peter Grainge"
                                                                <webforumsuser@macromedia.com> wrote:

                                                                >David
                                                                >
                                                                > This thread is about RH7 issues and RH7's metatag will define unicode. You
                                                                >seem to be using X3 which would have defined 1252 in the metatag. I am not
                                                                >clear why you are raising an X3 problem in a RH7 thread.
                                                                >
                                                                >

                                                                Sorry, I was just curious to know what was causing it. I have been
                                                                out of the RH loop now for several years and I'm trying to get back in
                                                                the swing of things.
                                                                • 29. Re: Unicode character set
                                                                  Level 7
                                                                  On Thu, 18 Dec 2008 12:48:56 +0000 (UTC), "HockeyTime"
                                                                  <webforumsuser@macromedia.com> wrote:

                                                                  >Hi David,
                                                                  >
                                                                  > Changing the browser to 1252 fixes only that specific browser session. Peter
                                                                  >is right though, this is only a RH7 problem.
                                                                  >
                                                                  > However, RoboHelp does have help in an Encoder tool that allows you to save
                                                                  >the generated HTML files to a 1252 encoding AFTER they are generated. Anyone
                                                                  >interested might want to contact the technical support people through the
                                                                  > http://www.adobe.com/cfusion/mmform/index.cfm?name=wishform&product=38.
                                                                  >
                                                                  >

                                                                  Thanks!
                                                                  • 30. Re: Unicode character set
                                                                    Peter Grainge Adobe Community Professional (Moderator)
                                                                    Is that definitely the correct link please? It is for reporting bugs and normally there would be no response. I would have expected to have to get the encoder tool from Tech Support or suchlike.

                                                                    • 31. Re: Unicode character set
                                                                      HockeyTime Level 1
                                                                      That is the correct link or at least the one that I used. I reported it as a bug, which it is. The problem is something that once worked but no longer does, which is a definition of a bug. Or at the very least, lost functionality. When I reported it, I got a response from a wonderful person who then helped me with the problem. I now have a "tool" that allows me to save the generated files to the correct encoding. Not the very best solution perhaps but it does save me from doing a search and replace on 3,000-plus files each time I compile.

                                                                      • 32. Re: Unicode character set
                                                                        Peter Grainge Adobe Community Professional (Moderator)
                                                                        Thanks. It is unusual to get a response but if that is what works, that is what works.