While I'm still puzzled by the behavior explained above, further testing has me wondering if the full text search is not working for us.
After importing the correct file and making a working link, and then regenerating everything, the topic still didn't appear in searches. Once I went back and linked it as an index keyword, however, it now appears in searches.
My understanding of the search feature in Microsoft HTML projects is that it is a full text search that looks at the body, not an index keyword search. Is this correct? And, if so, any ideas as to why our full-text search feature is not working properly?
> So I regenerated the project "as is", expecting that now the missing
> topics would no longer appear. This is where (for me), the real oddness
> begins. The topics still appear - the only apparent difference that
> regenerating the file did was that several screenshots disappeared (the
> text remains)!
If you right-click one of these ghost topics in the HTML Help viewer and then select Properties on the context menu, you can get the absolute path to the topic file from the Address (URL) field in the properties sheet. This should enable you to track down its source.
> My understanding of the search feature in Microsoft HTML projects is that
> it is a full text search that looks at the body, not an index keyword
> search. Is this correct?
Yes, that's right. The compiler parses the content of the individual HTML topic files and assembles a sort of database that maps each word to the topics that contain it.
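To make the "word to topics" idea concrete, here is a minimal Python sketch of that kind of mapping. It is only an illustration of the concept, not the compiler's actual index format or algorithm; the function name `build_index` and the topic names in the usage note are made up for the example.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text of an HTML document, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def build_index(topics):
    """Map each lowercase word to the set of topic names that contain it.

    `topics` is a dict of {topic_name: html_source} (hypothetical input shape).
    """
    index = defaultdict(set)
    for name, html in topics.items():
        parser = _TextExtractor()
        parser.feed(html)
        parser.close()
        text = " ".join(parser.chunks).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(name)
    return index
```

With an index like this, a search for a word is just a dictionary lookup, e.g. `build_index(topics)["printer"]`. The point is that the body text is what gets indexed, so a topic should be findable whether or not it has index keywords, provided it actually ends up in a .chm that the search covers.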
Is it possible that not every module in the merged help collection is as recent as you're assuming, or that there are junk HTML files lying around your hard drive that the compiler is somehow pulling in to the help files?
Just to verify, this is an HTML Help project and not a WebHelp project, correct? I do know that Microsoft made some changes recently to HTML Help to fix some security vulnerabilities. The main change, as I understand it, is that .chm files can no longer be run properly from a network location; the .chm file has to be on the local machine. I don't know if that's your issue or not, but it could be that some files are stored or installed locally, and others are accessed remotely. Follow Pete's advice, and move all the files to a local directory, or rebuild the .chm file.
But on a different note, do you have a specific need for HTML Help? I've found WebHelp to be a much more flexible format, and it doesn't have the restrictions that HTML Help has. WebHelp doesn't require an internet connection per se, only a browser, and everyone has one of those.
Anyway, I hope you get your issue resolved. If you do, please post the solution here for the benefit of the community.
> If you right-click one of these ghost topics in the HTML Help viewer and then select
> Properties on the context menu, you can get the absolute path to the topic file
> from the Address (URL) field in the properties sheet. This should enable you to
> track down its source.
I can see these files on my hard drive via Windows Explorer at the location shown in Properties, but they do not seem to exist in the project. Why would they show as broken links if they are in fact working external links?
> Is it possible that not every module in the merged help collection is as recent as you're
> assuming, or that there are junk HTML files lying around your hard drive that the compiler
> is somehow pulling in to the help files?
Both are possible, although Windows Explorer seems to do a good job of giving me creation dates for the .chm files. However, I'm not sure I follow how either would connect to the issue of whether the full-text search is actually searching text or merely searching index values - my understanding from the material in the help file is that the full-text search should work automatically as long as the search tab is selected (which it is for all projects).
Searching this forum, I see that someone else has a similar issue with full-text searching apparently only searching index keywords. Has anyone else had this issue? If so, any solutions? I am really leaning towards this being the reason why some of our topics are not appearing in searches, as all the trouble topics seem to be the non-indexed ones.
We have a problem we have to work around where the TOC shows a merged file, but search (and maybe the index, I can't remember off the top of my head) only shows topics in the Master project chm file.
This is caused by the paths being hardcoded in the [MERGE FILES] section of the .hhp file. We think it has to do with source control (we use VSS).
Instead of the files appearing as
they appear as:
Our workaround is a double compile:
1) Compile the help.
2) Open the .hhp file, delete the paths and save the .hhp file.
3) Recompile the help.
Unfortunately, we haven't found a way to solve this permanently, so we have to remember to do the double compile each time we update the master project.
Hopefully this is of some use to you.
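For anyone who would rather script the "delete the paths" step than do it by hand, here is a rough Python sketch. It assumes the .hhp file is plain INI-style text with one .chm entry per line under the [MERGE FILES] header; the function name `strip_merge_paths` is just something made up for this post, and I have not tried it against RoboHelp-generated files.

```python
import ntpath


def strip_merge_paths(hhp_text, section="[MERGE FILES]"):
    """Return hhp_text with each entry in `section` reduced to its bare file name.

    Lines outside the section, blank lines, and ';' comment lines are
    left exactly as they were.
    """
    out = []
    in_section = False
    for line in hhp_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header starts; note whether it is the one we want.
            in_section = stripped.upper() == section.upper()
            out.append(line)
        elif in_section and stripped and not stripped.startswith(";"):
            # ntpath understands both '\' and '/' separators on any platform.
            out.append(ntpath.basename(stripped))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

To use it, read the .hhp file into a string, pass it through `strip_merge_paths`, and write the result back before the second compile. The file encoding and whether the tool rewrites the section again on the next generate are things you would need to check in your own setup.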
My apologies for asking such a noob question, but where do I find this "[MERGE FILES]" section? I'm not seeing anything in any of our project files (.xpj files) that appears to match this.
(we're using RH5 btw)
> Just to verify, this is an HTML help project and not a Webhelp project, correct?
> But on a different note, do you have a specific need for HTML help?
I can't really say at this point. This is the format I've inherited. My guess is they went with something relatively self-contained to avoid any potential issues, as my understanding is that some of our customer sites can be pretty remote and/or use fairly bare-bones systems.
Looks like this may solve the search issue.
Opening the .hhp in Notepad, the [MERGE FILES] paths are all pointing to a set of .chms in a "Recovered" folder located on my computer (no idea what may have happened or what was involved with that), rather than the current .chms that I update.
One odd thing I'm seeing when I follow the instructions provided by Amebr above: when I delete the paths and regenerate the .chm, it's entering the new paths in the [MERGE FILES] section twice (i.e. doubling them).
Any ideas as to what may be causing that?
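Whatever the cause turns out to be, the doubled entries themselves are easy to collapse with a script. A minimal Python sketch follows, again assuming one entry per line under [MERGE FILES] in the .hhp file; `dedupe_section` is a hypothetical helper name, and it compares entries case-insensitively because Windows file names are not case-sensitive.

```python
def dedupe_section(hhp_text, section="[MERGE FILES]"):
    """Return hhp_text with repeated entries removed from `section` only.

    The first occurrence of each entry is kept; other sections are untouched.
    """
    out = []
    seen = set()
    in_section = False
    for line in hhp_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_section = stripped.upper() == section.upper()
            seen.clear()
        elif in_section and stripped:
            key = stripped.lower()
            if key in seen:
                continue  # drop the repeated entry
            seen.add(key)
        out.append(line)
    return "\n".join(out) + "\n"
```

This only tidies the file after the fact; it does not explain or prevent the doubling.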
Although I think the search issue has been resolved (or resolved in large part), I'm still curious as to how the deleted topics are still appearing in the .chms.
My fear with these files is that there may be a lot of "debris" floating around the project files (for lack of a better term), and I've no idea how to begin trimming and streamlining them to prevent (or at least limit) any future issues being caused by unmanaged / mismanaged linking and whatnot.
As such, I'm trying to understand how this material gets saved and linked when the obvious links (in the TOC) don't work or point in the wrong direction. Does anyone have good advice on this?
FWIW, the HTML Help compiler tries to pull into a .chm file all local files to which there is a link in one or more of these locations:
* An HTML topic file
* The FILES section of the project (.hhp) file
* The contents (.hhc) file
* The index (.hhk) file
So, for example, if one of your HTML topic files contains a hyperlink to another local HTML file, this second file will be compiled into the .chm file even if there is no reference to it in the .hhp, .hhc or .hhk file.
When there's any concern about incorrect source files being compiled into a .chm file, I think it's always a good idea to delete — or at least move to a different location — as much debris from the local drive as possible before compiling.
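If it helps with hunting for debris, here is a small Python sketch that lists the local files a topic links to through `href` and `src` attributes, which are the kind of references that can pull a file into the .chm. It is an approximation for auditing purposes, not a reproduction of the compiler's own rules; `local_targets` is a made-up name, and the sketch does not resolve relative paths against the topic's folder.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Two or more characters before the colon, so a drive letter like "C:" is
# not mistaken for a URL scheme such as "http:" or "mailto:".
_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]+:")


class _LinkCollector(HTMLParser):
    """Collects the raw values of href and src attributes."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.targets.append(value)


def local_targets(html):
    """Return the set of local file paths referenced by href/src attributes."""
    parser = _LinkCollector()
    parser.feed(html)
    parser.close()
    found = set()
    for target in parser.targets:
        if _SCHEME.match(target) or target.startswith("//"):
            continue  # web, mail, script, or cross-.chm link
        path = target.split("#", 1)[0].split("?", 1)[0]
        if path:  # skip pure "#anchor" links
            found.add(unquote(path).replace("\\", "/"))
    return found
```

Running this over every topic file and comparing the results with what the project actually lists should show which stray files are being referenced from somewhere.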
Thanks to everyone for your help. It looks like the search issue has been solved.
Two final quick questions:
1) Is this still an issue with RH7? Will I still need to follow Amebr's steps every time I generate a new master .chm when we upgrade to 7?
2) How does this affect automatic building? To me it would seem to make it impossible to automate the build and still have the search function working properly, given the need to manually correct the hhp file.