2 Replies Latest reply on Sep 12, 2011 4:55 PM by NCPotter

    Problems importing bookmarks plus possible UTF-8 compliance issue


      I am using RoboHelp 8 to upgrade our department's documentation. I have around 250 Word docs, each containing between 10 and 20 bookmarks. Initially I attempted to import the Word docs directly (MS Word 2010). Unfortunately the bookmarks were dropped on import. They were simple bookmarks with no HTML references, and their names use only basic characters with underscores as word separators. I checked all of the boxes on the edit conversion page, but to no avail.

      Having failed at every attempt to get past this problem, I decided to convert all the docs to HTML. I imported the HTML files and, success, my bookmarks were present and showing up in the TOC. Unfortunately all the non-breaking spaces (nbsp) had been converted to the numeric entity &#160 with a question mark in front ( ?&#160 ), so now my output is splattered with unwelcome question marks. I am guessing this is some sort of UTF-8 compliance issue, but I can find no relevant configuration options in either Word or RoboHelp.


      So I have two problems - solving either one will get me on my way - but it would be nice to know the solutions to both:

      1.  How do I import a word doc into RoboHelp 8 and have the bookmarks transfer successfully?

      2.  How can I import HTML docs without all my &nbsp characters picking up question marks?

        • 1. Re: Problems importing bookmarks plus possible UTF-8 compliance issue
          Peter Grainge Adobe Community Professional (Moderator)

          Clean imports from Word in the sense of squeaky clean HTML are near impossible but you should be able to get a decent import. It is important that in Word you Save As Web Page (Filtered) rather than just Save As Web Page.


          Somehow I doubt that is going to solve this one, so the next thing to try might be a decent find and replace tool such as FAR to search for ?&#160 and replace it with the correct code.
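
          If a dedicated tool is not to hand, the same search and replace can be scripted. The following is only a rough Python sketch, not something from FAR or RoboHelp: the function names and the folder argument are made up for illustration, and it assumes the files use an ASCII-compatible encoding (Windows-1252 or UTF-8, not UTF-16). It works on raw bytes so it does not need to know which of those encodings Word used. Try it on a copy of the files first.

```python
from pathlib import Path

# Byte patterns taken from the thread: the stray "?" sits directly before "&#160".
BAD = b"?&#160"
GOOD = b"&#160"


def fix_bytes(data: bytes) -> bytes:
    """Drop the '?' in every '?&#160'; other question marks are left alone."""
    return data.replace(BAD, GOOD)


def fix_folder(folder: str) -> int:
    """Rewrite .htm/.html files under `folder` in place.

    Returns the number of files that were actually changed.
    `folder` is a hypothetical path supplied by the caller.
    """
    changed = 0
    for path in Path(folder).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in {".htm", ".html"}:
            continue
        original = path.read_bytes()
        fixed = fix_bytes(original)
        if fixed != original:
            path.write_bytes(fixed)
            changed += 1
    return changed
```

          Calling fix_folder on the folder that holds the exported HTML would then report how many files it touched. A genuine sentence that ends in a question mark immediately followed by a non-breaking space would also lose its question mark, so a quick review of the changes is worthwhile.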


          In the Importing topics on my site I recommend importing each document into a new project created just for that document. When you have done any cleaning up, import the topics into the real project. That way, if there are any problems, you can trash just the temporary project rather than mess up the one with 100 documents already imported.


          I am working on a new topic to cover importing into RoboHelp 8 and 9 but a lot of the information you will find is still valid.


          See www.grainge.org for RoboHelp and Authoring tips



          • 2. Re: Problems importing bookmarks plus possible UTF-8 compliance issue
            NCPotter Level 1

            Thanks for the prompt response, Peter, and apologies for my tardy reply - priorities and all that. Digging a little deeper, I found issues that I ought to have identified beforehand. The first is that the default encoding in Word's web options was not set to UTF-8. Changing that setting got rid of the spurious question marks. Before I discovered that, I wrote a simple VB macro to replace ?&#160 with &#160, which also worked. The formatting wasn't helped by the fact that the Word docs were themselves derived from primitive text files, so recognizable formatting was negligible. To get the bookmarks to import, I still had to go the route of converting the Word files to HTML beforehand. Thank you once more.
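
            For anyone wanting to confirm the encoding setting took effect across a large batch before importing, one option is to check which charset each saved HTML file declares. This is a minimal Python sketch, not part of Word or RoboHelp: the function names are hypothetical, and it assumes the charset is declared in a meta tag within the first few kilobytes of the file and that the file is in an ASCII-compatible encoding.

```python
import re
from pathlib import Path

# Matches charset=... in either meta tag style, with or without quotes.
CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE)


def declared_charset(html: bytes):
    """Return the first charset named near the top of the file, lowercased, or None."""
    match = CHARSET.search(html[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def non_utf8_files(folder: str):
    """List (path, charset) for .htm/.html files not declared as UTF-8.

    `folder` is a hypothetical path; files with no declaration are
    reported with a charset of None.
    """
    report = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".htm", ".html"}:
            charset = declared_charset(path.read_bytes())
            if charset != "utf-8":
                report.append((path, charset))
    return report
```

            An empty list from non_utf8_files suggests every file was saved with a UTF-8 declaration. It only reads what the file claims about itself, so it does not prove the bytes are valid UTF-8.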