12 Replies Latest reply on Mar 18, 2008 5:59 AM by Tech Writer KC

    odd issue involving missing topics

    Tech Writer KC Level 1
      EDIT: Sorry, I keep forgetting to specify this, but our help is an HTML-based help system.

      First, some background: I'm working with an inherited help file system that has been years in the making, so I don't know about its history and what has gone on in the past.

      Anyhoo, the issue: I've been researching an issue reported to me that certain topics were not turning up when being searched on. They can be accessed by navigating directly to them, but they do not show up in any searches.

      At first I thought it might be a merge thing, as our help system is spread across over 20 individual projects. However, that avenue of thought hasn't panned out.

      Meanwhile, in looking at the actual project itself, I discovered that most of the actual topic files appear to have been deleted. At first this seemed extremely odd, given that they still existed in the final help file, but after a quick second or two of being braindead, I realized that the project .chm is fairly old, and that there was a good chance that the topics were somehow lost or deleted after it was generated, which would explain why they're still appearing in the .chm despite being missing in the project.

      So I regenerated the project "as is", expecting that now the missing topics would no longer appear. This is where (for me), the real oddness begins. The topics still appear - the only apparent difference that regenerating the file did was that several screenshots disappeared (the text remains)!

      This is unexpected to me as the topic files are missing from the project and the links from the TOC are broken. Nothing appears to be pointing to an external source, nor have I found these files misplaced elsewhere within the projects / .chms that make up our help system.

      As such, my expectation is that nothing should be appearing in this section, as I can't figure out where it's getting the material when I generate it. Anyone have ideas as to what may be going on? At this point I figure it's a straightforward matter of tracking down the original Word docs and reinserting them to update and fix the broken links (which in turn I hope means they'll appear in searches), but I'm really curious as to how this "ghost" material still exists in the new .chm, and if this possibly indicates any further issue I should be aware of (as again, I'm fairly in the dark with how the system was being built and maintained prior to me, and I've only just started here).

      Thanks in advance!
        • 1. Re: odd issue involving missing topics
          Tech Writer KC Level 1
          An update:

          While I'm still puzzled by the behavior explained above, further testing has me wondering if the full text search is not working for us.

          After importing the correct file and making a working link, and then regenerating everything, the topic still didn't appear in searches. Once I went back and linked it as an index keyword, however, it now appears in searches.

          My understanding of the search feature in Microsoft HTML projects is that it is a full text search that looks at the body, not an index keyword search. Is this correct? And, if so, any ideas as to why our full-text search feature is not working properly?
          • 2. Re: odd issue involving missing topics
            Pete Lees Level 2
            Hi,

            > So I regenerated the project "as is", expecting that now the missing
            > topics would no longer appear. This is where (for me), the real oddness
            > begins. The topics still appear - the only apparent difference that
            > regenerating the file did was that several screenshots disappeared (the
            > text remains)!

            If you right-click one of these ghost topics in the HTML Help viewer and then select Properties on the context menu, you can get the absolute path to the topic file from the Address (URL) field in the properties sheet. This should enable you to track down its source.
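
            For illustration, the address shown for a compiled topic usually combines the path of the .chm file and the path of the topic inside it, separated by "::". The sketch below splits such an address into its two parts. It assumes the common "mk:@MSITStore:", "ms-its:" and "its:" prefixes; the function name and the sample path in the usage note are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ChmAddress(NamedTuple):
    chm_path: str  # the .chm file the viewer loaded the topic from
    topic: str     # path of the topic inside that .chm


# Assumed prefixes for compiled topics; other address forms return None.
_ADDRESS = re.compile(
    r"^(?:mk:@MSITStore:|ms-its:|its:)(?P<chm>.+?)::/?(?P<topic>.*)$",
    re.IGNORECASE,
)


def parse_chm_address(address: str) -> Optional[ChmAddress]:
    """Split a viewer address into (.chm path, topic path), or return None."""
    match = _ADDRESS.match(address.strip())
    if match is None:
        return None
    return ChmAddress(match.group("chm"), match.group("topic"))
```

            Calling parse_chm_address on a made-up address such as mk:@MSITStore:C:\Recovered\help1.chm::/topics/intro.htm would report which .chm on disk actually supplied the topic, which may not be the one you expected.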

            > My understanding of the search feature in Microsoft HTML projects is that
            > it is a full text search that looks at the body, not an index keyword
            > search. Is this correct?

            Yes, that's right. The compiler parses the content of the individual HTML topic files and assembles a sort of database that maps each word to the topics that contain it.
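
            As a rough illustration of that word-to-topic mapping (not the compiler's actual implementation or file format), here is a minimal inverted index. The function names and the sample topics in the usage note are made up for the example.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, skipping <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_words(html):
    """Return the set of lowercase words found in a topic's text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower()))


def build_index(topics):
    """Map each word to the set of topic names whose text contains it."""
    index = defaultdict(set)
    for name, html in topics.items():
        for word in extract_words(html):
            index[word].add(name)
    return index


def search(index, query):
    """Return the topics that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    return set.intersection(*(index.get(word, set()) for word in words))
```

            With topics passed in as a dict of file name to HTML text, search(build_index(topics), "setup wizard") returns only the topics whose body text contains both words, with no index keywords involved. A topic that never reaches the indexing step cannot turn up in results, however it is linked.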

            Is it possible that not every module in the merged help collection is as recent as you're assuming, or that there are junk HTML files lying around your hard drive that the compiler is somehow pulling in to the help files?

            Pete
            • 3. Re: odd issue involving missing topics
              t.a.thompson
              Just to verify, this is an HTML help project and not a Webhelp project, correct? I do know that Microsoft made some changes recently to the HTML help format to fix some security vulnerabilities. The main change, as I understand it, is that a .chm file can no longer be run from a network location; it has to sit on the local machine. I don't know if that's your issue or not, but it could be that some files are stored or installed locally, and others are accessed remotely. Follow Pete's advice, and move all the files to a local directory, or rebuild the .chm file.

              But on a different note, do you have a specific need for HTML help? I've found Webhelp to be a much more flexible format, and it doesn't have the restrictions that HTML help has. Webhelp doesn't require an internet connection per se, only a browser, and everyone has one of those.

              Anyway, I hope you get your issue resolved. If you do, please post the solution here for the benefit of the community.
              • 4. odd issue involving missing topics
                Tech Writer KC Level 1
                > If you right-click one of these ghost topics in the HTML Help viewer and then select
                > Properties on the context menu, you can get the absolute path to the topic file
                > from the Address (URL) field in the properties sheet. This should enable you to
                > track down its source.

                I can see these files on my hard drive via Windows Explorer at the location shown in Properties, but they do not seem to exist in the project. Why would they show as broken links if they are in fact working external links?


                > Is it possible that not every module in the merged help collection is as recent as you're
                > assuming, or that there are junk HTML files lying around your hard drive that the compiler
                > is somehow pulling in to the help files?

                Both are possible, although Windows Explorer seems to do a good job of giving me creation dates for the .chm's. However, I'm not sure I follow how either would connect to the issue of whether the full-text search is actually searching text or merely searching index values - my understanding from the material in the help file is that the full-text search should work automatically as long as the search tab is selected (which it is for all projects).

                Searching this forum, I see that someone else has a similar issue with full-text searching apparently only searching index keywords. Has anyone else had this issue? If so, any solutions? I am really leaning towards this being the reason behind why some of our topics are not appearing in searches, as all the trouble topics seem to be the non-indexed ones.

                • 5. odd issue involving missing topics
                  Amebr-ke0mH4 Level 2
                  We have a problem we have to work around where the TOC shows a merged file, but search (and maybe the index, I can't remember off the top of my head) only shows topics in the Master project chm file.

                  This is caused by the paths being hardcoded in the [MERGE FILES] section. We think it's to do with source control (we use VSS).

                  Instead of the files appearing as
                  help1.chm
                  help2.chm

                  they appear as:
                  c:\project\help1.chm
                  c:\project\help2.chm

                  Solution:
                  compile the help.
                  open the hhp file, delete the paths and save the hhp file.
                  recompile the help.
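
                  The middle step (deleting the paths) could be scripted. The sketch below is one possible approach, not a tested fix for RoboHelp: it assumes the section header in the .hhp file reads [MERGE FILES] and that every non-comment line in that section is a single .chm path. The function name is hypothetical, and dropping repeated entries is an extra guard that goes beyond the manual steps.

```python
import ntpath


def strip_merge_file_paths(hhp_text):
    """Return .hhp text whose [MERGE FILES] entries are bare file names."""
    output = []
    seen = set()
    in_merge_section = False
    for line in hhp_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header: note whether it is the one we want.
            in_merge_section = stripped.upper() == "[MERGE FILES]"
            output.append(line)
            continue
        if in_merge_section and stripped and not stripped.startswith(";"):
            # ntpath splits on backslashes regardless of the host OS.
            name = ntpath.basename(stripped)
            if name.lower() in seen:
                continue  # assumption: repeated entries are unwanted
            seen.add(name.lower())
            output.append(name)
        else:
            output.append(line)
    return "\n".join(output) + "\n"
```

                  Reading the .hhp file, passing its text through this function and writing the result back between the two compiles would mirror the manual steps. Whether the authoring tool rewrites the paths on the next generate is a separate question this sketch does not address.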

                  We haven't found a way to solve this permanently, unfortunately, so we have to remember the double compile each time we update the master project.

                  Hopefully this is of some use to you.
                  Amber
                  • 6. Re: odd issue involving missing topics
                    Tech Writer KC Level 1
                    My apologies for asking such a noob question, but where do I find this "[MERGE FILES]" section? I'm not seeing anything in any of our project files (.xpj files) that appears to match this.

                    (we're using RH5 btw)
                    • 7. Re: odd issue involving missing topics
                      Tech Writer KC Level 1
                      > Just to verify, this is an HTML help project and not a Webhelp project, correct?

                      Correct.

                      > But on a different note, do you have a specific need for HTML help?

                      I can't really say at this point. This is the format I've inherited. My guess is they went with something relatively self-contained to avoid any potential issues, as my understanding is that some of our customer sites can be pretty remote and/or use fairly bare-bone systems.


                      • 8. Re: odd issue involving missing topics
                        Captiv8r Adobe Community Professional & MVP
                        Hi Tech Writer KC

                        This is what one would find inside the ProjectName.HHP file. This is the file used by the Microsoft HTML Help compiler when it creates a .CHM file. You would open this file in something like Windows Notepad to see this entry, assuming it is there.

                        Cheers... Rick
                        • 9. Re: odd issue involving missing topics
                          Tech Writer KC Level 1
                          Looks like this may solve the search issue.

                          Opening the .hhp in Notepad, I see that the [MERGE FILES] paths are all pointing to a set of .chms in a "Recovered" folder located on my computer (no idea what may have happened or what was involved with that), rather than the current .chms that I update.

                          One odd thing I'm seeing when I follow the instructions provided by Amebr above: when I delete the paths and regenerate the .chm, it's entering the new paths in the [MERGE FILES] section twice (i.e. doubling them).

                          Any ideas as to what may be causing that?
                          • 10. Re: odd issue involving missing topics
                            Tech Writer KC Level 1
                            Although I think the search issue has been resolved (or resolved in large part), I'm still curious as to how the deleted topics are still appearing in the .chms.

                            My fear with these files is that there may be a lot of "debris" floating around the project files (for lack of a better term), and I've no idea how to begin trimming and streamlining them to prevent (or at least limit) any future issues being caused by unmanaged / mismanaged linking and whatnot.

                            As such, I'm trying to understand how this material gets saved and linked when the obvious links (in the TOC) don't work or point in the wrong direction. Does anyone have good advice on this?
                            • 11. Re: odd issue involving missing topics
                              Pete Lees Level 2
                              Hi,

                              FWIW, the HTML Help compiler tries to pull into a .chm file all local files to which there is a link in one or more of these locations:

                              * An HTML topic file
                              * The FILES section of the project (.hhp) file
                              * The contents (.hhc) file
                              * The index (.hhk) file

                              So, for example, if one of your HTML topic files contains a hyperlink to another local HTML file, this second file will be compiled into the .chm file even if there is no reference to it in the .hhp, .hhc or .hhk file.

                              When there's any concern about incorrect source files being compiled into a .chm file, I think it's always a good idea to delete — or at least move to a different location — as much debris from the local drive as possible before compiling.
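
                              To make the link-following behaviour concrete, here is a small sketch that starts from a few seed files and follows local href and src links to list everything reachable. It only approximates the idea: the real compiler's rules are not reproduced, drive-letter paths are ignored, and the function names and sample file names are invented for the example.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class _LinkCollector(HTMLParser):
    """Collects the values of href and src attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def local_links(html, base):
    """Return normalized project-relative paths linked from one topic."""
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    found = set()
    for link in collector.links:
        parts = urlsplit(link.replace("\\", "/"))
        if parts.scheme or parts.netloc or not parts.path:
            continue  # external URL, drive-letter path, or in-page anchor
        joined = posixpath.join(posixpath.dirname(base), unquote(parts.path))
        found.add(posixpath.normpath(joined).lower())
    return found


def reachable_files(seeds, pages):
    """Follow links from the seeds; pages maps lowercase path -> HTML text."""
    pending = [seed.replace("\\", "/").lower() for seed in seeds]
    reached = set()
    while pending:
        path = pending.pop()
        if path in reached:
            continue
        reached.add(path)
        if path in pages:
            pending.extend(local_links(pages[path], path))
    return reached
```

                              Seeding it with the files named in the .hhp, .hhc and .hhk files and comparing the result against the files on disk (set(pages) - reached) gives a first list of candidates for debris. The reverse comparison can show files being pulled in that you did not expect.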

                              Pete
                              • 12. Re: odd issue involving missing topics
                                Tech Writer KC Level 1
                                Thanks to everyone for your help. It looks like the search issue has been solved.

                                Two final quick questions:

                                1) Is this still an issue with RH7? Will I still need to follow Amebr's steps every time I generate a new master .chm when we upgrade to 7?

                                2) How does this affect automatic building? To me it would seem to make it impossible to automate the build and still have the search function working properly, given the need to manually correct the hhp file.