Hello All,
I am using RH8 to create WebHelp output. My client has many, many PDF files that I have added as baggage and linked within the topics. When I compile the project, the resulting WebHelp does not search within the PDF files.
The PDFs are baggage and are linked at least once within at least one topic. They are also in the TOC, though I read somewhere that that has no bearing on Search. Most, if not all, of the PDFs are searchable from within Acrobat, and I can select and copy text from them -- they are not simply images.
Does anyone know why RH wouldn't search inside the PDFs? This is an important feature and the project deadline is this week!
Thanks,
-jeff
Hi there
The OP never came back to say whether the suggestion worked or not, but the thread linked below may be of some interest.
Cheers... Rick
Helpful and Handy Links RoboHelp Wish Form/Bug Reporting Form Begin learning RoboHelp HTML 7 or 8 moments from now - $24.95! |
Hi Rick,
Thank you for such a quick reply. I examined the SALIdxSvc12.xml file and PDF is listed in the file. In fact, PDF is listed first. So I'm afraid that isn't the issue.
Does anyone have any other suggestions? I'm grateful for any help.
Incidentally, for those who are curious, here is the path, on my Windows XP system, to the SAL index file that Colum discusses in his blog:
C:\Program Files\Adobe\Adobe RoboHelp 8\RoboHTML\SALIdxSvc12.xml
Thanks,
-jeff
Hello again Jeff
Have you tested in different browsers? For example, maybe it works in Firefox but not IE (or vice versa).
Also, I see you mention a LAN server? Have you tried placing the content on a web server?
Cheers... Rick
Hi Rick,
It does not work in Firefox either. I have an Excel file in the baggage, and that was indexed, so RH is indexing external files, but it doesn't seem to be indexing the PDFs.
No server of any kind is involved. I am storing, producing, and testing everything locally.
-jeff
Hello again
Hmmm, for some reason I thought you said you were testing from a network server. Sorry, my mistake.
I think I've seen Peter ask what was used to create the PDF files. Perhaps you might try creating a simple PDF on your setup that contains an odd term such as redrabbit and linking to it to see if the PDF content appears. Maybe different PDFs are created in different ways and some of those ways result in an inability for RoboHelp to search through them.
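One way to run the "redrabbit" test without relying on the search pane is to look for the term directly in the generated output. The following is only a rough Python sketch, not anything RoboHelp provides: the function name, the list of file extensions, and the assumption that the search data is stored as plain UTF-8 text are all hypothetical. If the search data is compressed or encoded differently, this will report nothing even when search works.

```python
from pathlib import Path


def files_containing(root, term, suffixes=(".xml", ".htm", ".js")):
    """Return relative paths of text files under root that contain term.

    The match is case-insensitive. Only files whose extension is in
    `suffixes` are read, so the PDF itself is never counted as a hit.
    The extension list is an assumption, not a documented RoboHelp fact.
    """
    root = Path(root)
    needle = term.lower()
    hits = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Assumes UTF-8 text; undecodable bytes are skipped.
            text = path.read_text(encoding="utf-8", errors="ignore")
            if needle in text.lower():
                hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

For example, calling files_containing on the generated WebHelp folder with the term "redrabbit" should list at least one generated file if the PDF text made it into the output, and an empty list if it did not.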
Admittedly a bit stumped on it.
Cheers... Rick
Thanks Rick, I'll try that. It may take a little while, but I will post back the results of my test when I have a chance.
Thanks so much for your help,
-jeff
The question about what was used to create the PDF relates to importing PDFs. I'm not aware that searching is affected, but it's worth testing.
See http://www.grainge.org/pages/authoring/rh8/using_rh8.htm Item 22
See www.grainge.org for RoboHelp and Authoring tips
I asked the original question on the thread that is now locked. We are using RoboHelp 8 on a Windows 7 64-bit computer. We are now using Adobe Acrobat 9 Pro to create PDFs, but RH8 is not searching *any* of the PDFs I've attached as baggage files, including the ones I created in 2007-2009 using previous versions of Acrobat.
Here's some more information, which may or may not be pertinent:
When I generate the Webhelp on my local computer, I can search the PDFs just fine. It's not until the help gets generated from the command line using our build computer that PDF search stops working. For example, if I look for a specific term in the help I generate on my computer directly from RH8, I get 10 results, one of which is within a PDF. When I perform the same search on the help generated by our build computer, I get only 9 results--the PDF file is not listed.
There don't appear to be any differences whether you view and search the WebHelp with Internet Explorer, Firefox, or Chrome. I simply can't search PDFs in the help generated using the command line. I checked the SALIdxSvc12.xml on both my local computer and the computer used for the build, and both files include the <Type Ext=".pdf"/> element. Both computers also appear to process "full-text search data" in about the same amount of time.
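For anyone who wants to repeat the SALIdxSvc12.xml check on several machines, here is a small Python sketch. It relies only on the <Type Ext=".pdf"/> element quoted above. The function name is made up, and the sketch assumes the file is well-formed XML without namespaces; nothing else about the file's layout is assumed.

```python
import xml.etree.ElementTree as ET


def indexed_extensions(xml_path):
    """Collect the Ext attribute of every <Type> element, lowercased.

    Assumes the <Type> elements carry no XML namespace; if they do,
    iter("Type") will not match them and the result will be empty.
    """
    tree = ET.parse(xml_path)
    exts = set()
    for elem in tree.iter("Type"):
        ext = elem.get("Ext")
        if ext:
            exts.add(ext.lower())
    return exts
```

Running it against the file on the local PC and on the build computer and checking that ".pdf" is in both results reproduces the comparison described above.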
The only difference I found was when comparing files between the help I generated on my PC and the help generated on the build computer. My copy of one specific project includes 24 files that were missing from the build's copy: 12 whxdata\package_x.xml files and 12 whgdata\whlstflxx.htm files. I don't know exactly what these files do, but they appear to be alphabetized lists of keywords--are these the files created for searching PDFs? If so, why aren't they being generated by the command line script??
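The file comparison described above can be repeated with a short script. This is a minimal Python sketch with hypothetical function names. It compares file names only, not contents, and treats paths as case-sensitive even though Windows does not.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def missing_from_build(local_root, build_root):
    """List files present in the local output but absent from the build output."""
    return sorted(relative_files(local_root) - relative_files(build_root))
```

Given the two output folders, missing_from_build returns the files that exist only in the locally generated copy, which in the case above would be the whxdata and whgdata files.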
The only other odd thing I'm seeing is that since upgrading to RH8, when I try to open an attached PDF in generated Webhelp, I frequently get an Adobe Acrobat error message that is blank. The PDF does not appear until I click the link a few times more. Has anyone else seen this error?
It may be stating the obvious but if all the other projects contain PDFs that can be searched but one fails, surely there has to be something different about the command or the target folder. The command does not (cannot) exclude production of the wh* folders and files.
If you generate the rogue project again locally to a clean folder, is it still producing the files you report are missing in the command line version?
Sorry, I didn't make myself clear. None of the PDFs in any of the projects are searchable when they are generated from the command line. All of the PDFs are searchable when they are generated on my local computer. I'm going to try to run the command line script locally to see if I can narrow things down a bit more--I'm not sure if it's a command line script issue or something different between how the two computers are set up.
Hi folks
Another thought occurs here as well. You state that search works fine if you manually generate the WebHelp but it fails if the automated process generates it. So I'd be closely scrutinizing the automated process. I've never used it myself, but I understand that it works by specifying a Single Source Layout. So I see the possibility that you are manually generating WebHelp using layout X but the automated process generates using layout Y. And perhaps there is a setting that differs between the layouts and is influencing things.
Just thinking out loud... Rick
Oh, and one other thing. Apologies if this has been already handled. But I don't recall seeing it anywhere.
Does search still work if you manually Generate and upload the files to the server? In other words, you test and it works off your local PC. But if you manually update the server does the search still find the PDF afterward?
Cheers... Rick
Some good suggestions. Thanks. I don't think it has to do with the SSL, however, since I'm simply using the primary Webhelp layout and have not changed the name.
But I did try this little experiment and got some interesting results:
1. I generated the help from Robohelp and published it to the server. PDF search works fine on the published help for most PDFs. However, I just noticed that a few PDFs are not being searched. But that may have been the case all along--I still need to confirm that.
2. I generated one project (the one with the most PDFs) from within Robohelp to a folder on my PC. I then searched the project for the word "camera" and got 9 hits, 6 of which were in PDF files.
3. I then deleted that output and used the rhcl command to generate the same project to the same location. Note that I used the simplest form of the command (which uses the default layout and the default output folder). Everything looks the same, but when I search the project for the word "camera," I get only 3 hits--the PDF files are not included in the search results.
In addition, when I generated the project from the command line, the following lines were displayed:
Scanning project for compilation....
Scanning finished.
Warning: No baggage file description.
Starting compilation...
So the problem appears to be that there is no baggage file description. Any idea if there's a way to add one manually?
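On a build computer, this warning is easy to miss in the console output. The sketch below shows one way a build script might flag it, assuming rhcl is invoked with just the project file as in the simplest form described above. The function names are hypothetical, and the sketch does not know whether rhcl writes the warning to standard output or standard error, so it checks both. Nothing in this thread confirms that the warning is the actual cause of the missing PDF search results.

```python
import subprocess

# Warning text exactly as it appeared in the console output quoted above.
BAGGAGE_WARNING = "Warning: No baggage file description."


def has_baggage_warning(output):
    """Return True if the captured rhcl output contains the baggage warning."""
    return BAGGAGE_WARNING in output


def run_rhcl(project_path, rhcl_exe="rhcl"):
    """Run rhcl with only the project file and return its combined output.

    rhcl_exe is an assumption: it must be on PATH or given as a full path.
    """
    result = subprocess.run([rhcl_exe, project_path], capture_output=True, text=True)
    return result.stdout + result.stderr
```

A build script could call run_rhcl on the .xpj file, pass the result to has_baggage_warning, and fail the build or send a notification when it returns True.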
For what it's worth, I can now reliably reproduce the problem. Here's what I did; maybe someone can try this to see if they get the same results.
Now instead of using RH to generate the Webhelp, I used the Windows command line editor from the start menu. I changed the directory to the RH program folder, then typed rhcl c:\testpdf\testpdf.xpj. The project was re-generated but with a warning that no baggage file description was found. And sure enough, when I viewed the generated output, I could open the PDF, but a search of the webhelp did not find the term I had previously located.
Any ideas what could be going wrong? I'm assuming that lots of other people are using rhcl to generate webhelp that includes fully searchable PDF files...
Thanks,
Kathy
Kathy
I just set up a test and followed your instructions. Got the same result as you. The help does not indicate anything else that needs to be added to the command line so I don't know why this isn't working.
Thank you so much, Peter. Even if you don't know why it's happening, I really appreciate you taking the time to verify that the problem exists for others, not just me. I'll assume it's some kind of bug and will submit it to Adobe. Maybe it'll be fixed in RH9.
Thanks again,
Kathy
Here's a workaround: