HTML Help worked in RH7 but not in RH8

Guest
Sep 15, 2009

I have a sizeable HTML Help project that worked fine in RH7. The raw HTML files are installed into an application that uses the SAX parser to parse the HTML. This all worked correctly in RH7. After upgrading to RH8, the same HTML files installed into the application now fail with the following error message: "org.xml.sax.SAXParseException: Content is not allowed in prolog."

Upon examination of the same HTML file that works in RH7 but not in RH8, we note the following differences:

In RH7, the HTML preceding the first <head> token is:

<!doctype HTML public "-//W3C//DTD HTML 4.0 Frameset//EN">

In RH8, the HTML preceding the first <head> token is:

<?xml version="1.0" encoding="utf-8" ?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

We have verified that if we manually modify the RH8 file to look like the RH7 file as above, it works in the application.

Is there a setting somewhere in RH8 that I need to change?

Any suggestions will be greatly appreciated, but please don't tell me to manually modify the HTML in 300+ help topics.

Bob Boller

LEGEND ,
Sep 16, 2009

Hi Bob.

I think you may have to change all 300+ files, but you could easily do this with a find and replace tool like BKReplaceEm or FAR. Both are excellent for this type of thing.
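If a dedicated tool is not to hand, the same kind of batch edit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `rewrite_topics`, the `*.htm` pattern, and the `OLD`/`NEW` placeholder strings are all hypothetical, and the demo runs against throwaway files in a temporary folder rather than a real project.

```python
import tempfile
from pathlib import Path


def rewrite_topics(root, transform, pattern="*.htm"):
    """Apply transform(bytes) -> bytes to every matching file under root.

    Returns the number of files whose content actually changed.
    """
    changed = 0
    for path in sorted(Path(root).rglob(pattern)):
        original = path.read_bytes()
        updated = transform(original)
        if updated != original:
            path.write_bytes(updated)
            changed += 1
    return changed


# Demo on throwaway files; OLD and NEW stand in for whatever text is swapped.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.htm").write_bytes(b"<p>OLD</p>")
    (root / "b.htm").write_bytes(b"<p>unchanged</p>")
    changed = rewrite_topics(root, lambda data: data.replace(b"OLD", b"NEW"))
    result = (root / "a.htm").read_bytes()
```

Working on bytes rather than decoded text avoids accidentally changing the file encoding. Run anything like this on a copy of the files first.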

The problem here is that you are using the raw topic files and effectively generating the output outside of RH. If you are doing this you can't expect Adobe to support the SAX parser. RH8 uses XHTML as opposed to RH7's HTML and upgrades each topic when the project is first opened. I can't see any way back unless there is something on the SAX parser side to allow for XHTML.


Read the RoboColum(n).

LEGEND ,
Sep 16, 2009

Hi,

Changing the doctype of your output files from XHTML to HTML might not be such a good idea. XHTML has a (very slightly) different syntax than HTML, and changing the DTD may have unforeseen consequences, although it will probably work for most browsers. In any case, your output will no longer be 'valid', as you will be using some syntax that is incorrect for HTML. See http://www.w3schools.com/XHTML/xhtml_html.asp for an overview of the differences between HTML and XHTML.

I don't know anything about the SAX parser, but I agree with Colum that the only (and probably the best) way is to get the parser to work with XHTML.

Greet,

Willam

Guest
Sep 18, 2009

Well, here is some more insight into this problem. The edited HTM files in RH8 are in XHTML, whereas in RH7 and earlier they were in HTML. As a result, in RH8 there is what is known as a signature at the beginning of each HTM file that specifies the character encoding. If you use UTF-16 encoding, then you must include a byte order mark to indicate whether the encoding is big endian or little endian. If you use UTF-8 encoding, then the byte order mark is not needed. There seems to be a difference of opinion between the XHTML standards folks and the Java folks as to whether specifying a byte order mark with UTF-8 constitutes invalid syntax. The XHTML folks say it is not invalid syntax; the Java folks say it is. What is happening in my situation is that the application using the HTM files for on-line help uses the Java built-in parser, and it rejects the UTF-8 byte order mark as invalid.
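For anyone wanting to confirm whether a topic file carries the mark, here is a minimal Python sketch. It assumes the file is UTF-8 and that the mark is the standard three bytes EF BB BF at the very start of the file, ahead of the `<?xml` declaration. The function names and the sample bytes are made up for illustration; only `codecs.BOM_UTF8` comes from the standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the byte string begins with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, otherwise unchanged."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Illustrative stand-in for the first bytes of an RH8-edited topic.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="utf-8" ?>\n<html></html>'
cleaned = strip_utf8_bom(sample)
```

In practice the bytes would come from reading the topic file in binary mode. A parser that insists the document start with `<` would see `cleaned` as beginning directly with the XML declaration.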

I have found a work-around. For my project, I am generating HTML Help. Instead of using the RoboHelp edited HTM files, I use the Microsoft HTML Help Workshop to decompile the generated chm file. Before generating the chm file, I check the check box labeled "Convert RoboHelp edited topics to HTML" on the General tab of the Options dialog in RoboHelp. This results in the HTM files in the chm being in HTML rather than XHTML. With these files, the Java parser is happy.

I don't know if we should raise this issue with Adobe. Any thoughts?

Bob Boller

Community Expert ,
Sep 20, 2009

Hi Bob

Interesting information. I will bring your post to Adobe's attention.

Turning to your problem, I have seen posts before where tools used by developers whinge about the output and the problem is the tool, not the help. I am not saying that is the case here but maybe some searching outside this forum might reveal something. Whilst the parser fails the output, does it nonetheless work with the application?

You say

The raw HTML files are installed into an application...

Later you say

For my project, I am generating HTML Help. Instead of using the RoboHelp edited HTM files, I use the Microsoft HTML Help Workshop to decompile the generated chm file. Before generating the chm file, I check the check box labeled "Convert RoboHelp edited topics to HTML" on the General tab of the Options dialog in RoboHelp. This results in the HTM files in the chm being in HTML rather than XHTML.

To me, raw HTML files means the source files before RH has done any processing. That conflicts with the later statement you are generating a CHM.

I am also not clear on the process, as you seem to be using the Help Workshop to create a CHM, then decompiling it, then using the source files that creates to get RH to create another CHM. Why not have RH create the CHM with the Convert to HTML option set?


See www.grainge.org for RoboHelp and Authoring tips

Help others by clicking Correct Answer if the question is answered. Found the answer elsewhere? Share it here. "Upvote" is for useful posts.

Guest
Sep 21, 2009

Peter - When I used the term "raw HTML", I was referring to the RH-edited files. The check box "Convert RoboHelp edited topics to HTML" seems to affect only the HTML that is compiled into the chm. It does not change the RH-edited files; they remain in XHTML. I am decompiling the RH-produced chm to get to the HTML files inside. I cannot provide the chm file to the application as it uses the built-in Java parser and, thus, isn't able to decompile the chm file.

Bob

Community Expert ,
Sep 21, 2009

So why not generate webhelp with that checkbox ticked? That will give you output files which is surely what your developers want?


Guest
Sep 21, 2009

Peter - That's what I'm doing. It's just that only the files compiled into the chm are in HTML. The RH-edited files remain in XHTML.

Bob

Community Expert ,
Sep 21, 2009

I did a quick test, and generating webhelp with the setting ticked, my topics get the following doc type:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Just to clarify, the setting only applies to the output files (i.e. the conversion takes place as part of the Generation process), not the source files that you edit in RH.

So if you output to Webhelp, you won't need to do the decompile step in HTML Help Workshop - all your topics will be in HTML.

Guest
Sep 22, 2009

The problem is not the doctype, the problem is the signature -

<?xml version="1.0" encoding="utf-8" ?>

Specifically, it's the three bytes known as the byte order mark (BOM), which sit at the very start of the file, ahead of this declaration. The BOM is required for UTF-16, but is not needed for UTF-8. The disagreement seems to center on whether having a BOM with UTF-8 constitutes invalid syntax in XHTML.

Bob

LEGEND ,
Sep 22, 2009

Hi,

An HTML file only uses the XML declaration for XHTML. The XML declaration is not required but encouraged by the W3C, see http://www.w3.org/TR/xhtml1/#docconf. Since the DTD is no problem, I can think of two things:

- Remove the XML declaration from your output files. The Byte Order Mark problem should disappear, as long as you only use UTF-8 or UTF-16 encoding.

- Output your files as HTML. 'Regular' 4.01 HTML doesn't use the XML declaration, regardless of the DTD.
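The first option could be sketched in Python roughly as follows. This is an illustration, not a tested RoboHelp procedure: the function name `drop_xml_declaration` and the sample bytes are hypothetical. As an added assumption, the pattern also swallows a UTF-8 byte order mark if one precedes the declaration, since deleting the declaration alone would leave those three bytes in place.

```python
import re

# Optional UTF-8 BOM, optional whitespace, then the <?xml ... ?> declaration.
XML_DECL = re.compile(rb'\A(?:\xef\xbb\xbf)?\s*<\?xml[^>]*\?>\s*')


def drop_xml_declaration(data: bytes) -> bytes:
    """Remove a leading XML declaration (and any BOM in front of it)."""
    return XML_DECL.sub(b'', data, count=1)


# Illustrative stand-in for the start of an XHTML output file.
sample = (
    b'\xef\xbb\xbf<?xml version="1.0" encoding="utf-8" ?>\n'
    b'<!DOCTYPE html>\n<html></html>'
)
stripped = drop_xml_declaration(sample)
untouched = drop_xml_declaration(b'<!DOCTYPE html>')
```

Files that do not start with a declaration pass through unchanged, so the function is safe to apply to a mixed set of topics.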

Greet,

Willam

Community Expert ,
Sep 22, 2009

Hi Bob,

I should have been more clear. When I generate to webhelp with the 'convert to html' setting ticked, the start of the file is the doctype declaration I posted above. The '<?xml' tag was not included.

Amebr

Community Expert ,
Sep 22, 2009

I don't think you are doing what I suggested. To create webhelp, you use a different layout. CHMs are generated from RH using the Microsoft HTML layout and you get one file. To create WebHelp, you use the WebHelp layout and you get the HTML files that your developers want as a whole bunch of separate HTML files. If you use the Convert to HTML option, they will be what you want.

Tell me if I am misunderstanding you but you seem to be going around the houses to get HTML when you can get RH to do it for you.


Guest
Sep 22, 2009

Peter - Well, I tried that and now when I plug the HTML files into the application, I get the error message shown in the attachment on every topic. I noticed that for every topic I looked at, the line number was several lines beyond the last line of the HTML file. I guess that suggests there is something missing that the HTML parser is looking for.

I also notice that in the portion of the HTML line called the signature, that is, the part that specifies the character set, RH did not include a BOM for UTF-8, whereas in the RH-8 edited XHTML files, the BOM is present.

Bob

LEGEND ,
Sep 22, 2009

Hi,

There's no attachment to your post.

Greet,

Willam

Guest
Sep 22, 2009

Willam, I replied via email and it was there on the email. I guess it got stripped off somewhere. Here it is.

Bob

LEGEND ,
Sep 22, 2009

Hi Bob

In glancing at this thread a question forms in my mind.

It would seem that all you want is a basic editor to regurgitate raw HTML files that are used by your application in a way that avoids any features really added by RoboHelp.

So my question is to ask quite simply: Why are you using RoboHelp for this?

Is it simply because you are accustomed to its interface? I might think that if it's causing issues for you in working with your application, you might find it simpler to move to a more simplistic basic HTML editor. Perhaps Dreamweaver? CoffeeCup HTML Editor? There are many different HTML editors around.

Note that RoboHelp HTML is very near and dear to my heart, so suggesting a move from it doesn't come lightly. But if you aren't really utilizing its output, I'm not seeing what the purpose is. It would seem to compare to using a compound miter saw with laser guides to slice the butter for your bread.

Cheers... Rick

Helpful and Handy Links

RoboHelp Wish Form/Bug Reporting Form

Begin learning RoboHelp HTML 7 or 8 within the day - $24.95!

Adobe Certified RoboHelp HTML Training

SorcerStone Blog

RoboHelp eBooks

Guest
Sep 22, 2009

Rick - I use RH for two primary reasons, the WYSIWYG editor so I don't have to mess with HTML and so I can generate a User Guide from the on-line help that I know is content-identical to the on-line help - write once, publish twice.

Bob

Guest
Sep 22, 2009

By the way, in case anyone is wondering, the application I'm writing the help for is not some off-the-wall, weird application. It is a plug-in to Eclipse 3.4, which is used by many thousands of users. And yes, the handling of the help files is done by Eclipse, not by the plug-in, so the problem isn't in the plug-in.

Bob

Community Expert ,
Sep 22, 2009

Bob

RH8 has a script for producing Eclipse Help. Maybe that is what you should be using?

My topic About RH8 has a link to the RH8 Reviewers Guide where there is mention of it. My topic also has a link to my RH Tour where there is a basic reference.

Beyond that, I don't think I can suggest anything else on this one.


New Here ,
Sep 20, 2009

I have a different, but possibly related problem.

I developed several WebHelp systems with Madcap Flare, only to discover that when served by our product's homegrown HTTP 1.1 web server, though they worked fine for FireFox, they did not work with IE 7 or IE 8. The problem is that our web server, seeing the Madcap output files have a .htm extension, sends a MIME Type of text/html with the response. Internet Explorer, apparently, inspects the file itself for the MIME Type, and sees that the files are actually XML files, starting with this:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Now, we could adjust our web server--apparently Apache and IIS have done so--but instead I've been told to migrate my help systems to a tool that works with our web server.

So I downloaded a trial version of RoboHelp 8 and created a test system to be sure RoboHelp works. Unfortunately, it has the same problem, and this is not surprising since the RoboHelp 8 output files also begin with:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

My first question is (and from what I've read I think I know the answer), does RoboHelp 7 have this <?xml...> and XHTML prolog?

My second question, assuming the answer to the first is No, is how easy is it to obtain RoboHelp 7 at this point? Does Adobe insist that new licensees get RoboHelp 8?

Incidentally, with Madcap, I tried removing the "<?xml version="1.0" encoding="utf-8"?>" line from the output .htm files, but this did not solve the problem for Internet Explorer (so maybe Internet Explorer is looking somewhere else to conclude that these are XML files--maybe it's the DOCTYPE line, which I didn't remove). I haven't tried this for my RoboHelp 8 test system, but hesitate to do so because the developer that would test it doesn't have time to test things likely to fail.

Has anyone else run into this problem with non-Apache and non-IIS web servers? I can supply further particulars about it (for example, it works on IE if we turn compression off; but doing so would slow our system down too much, apparently).

Thanks,

- Willie

Community Expert ,
Sep 20, 2009

Your problem is sort of related.

The server has to be set up to recognise UTF8 and it sounds like the problem is that yours is not.

I had a problem where the output was OK in IE but only the BOM characters showed in Firefox. This is what I was advised by the company hosting my site.

"I would therefore conclude that the solution to this problem (on Linux systems running Apache) is to add the AddDefaultCharset utf-8 directive to either the Apache config or the site .htaccess file. The advantage of the latter is that it only affects individual sites. The default Apache character set is taken from the locale file on Linux and defaults to iso-8859-1. It is the conflict between the Apache header with iso-8859-1 and the page character set of utf-8 that obviously causes Firefox a problem."

In a forum post Chrissy_Tissy added

My machine is Windows, but this fix still worked  - some notes about making the fix visible:

1. Do the fix itself (httpd.conf: AddDefaultCharset utf-8).

2. Restart the box to apply the fix.

3. Once the box is restarted, clear your cache in FireFox to make sure you don't continue to see the cached file.

Once all this is done you will see the output content as expected.

I am wondering if your server can be amended in a similar way? If not, in RH8 look in Tools > Options and tick the options I have highlighted. See if that produces an output that will be agreeable to your server.

ConvertToHTML.jpg

Finally, if not, Adobe does have a tool that works on the output and changes the encoding to whatever you want. Trouble is it works on one folder at a time so it can be painful if you have many folders.

I would appreciate you posting back the solution you finally go for. It all helps us when people have similar problems.

