I have an issue that I've spent days of Google searches on without finding an answer. I have a text catalog created with ID CS6. Sections were PDF documents given to me by our engineering department. The book was created and exported, and all went well until I found out that when I try to copy and paste the text into Microsoft Word or Notepad, I get a lot of small boxes in place of the letters. When I try to search the text in Adobe Reader, I can't find words which I can see are there.
This is the first catalog I've created in CS6. Previous catalogs created in the exact same manner in CS4 & CS2 do not have this issue. I thought it could have been the infamous Arial Narrow font issue and researched that to death.
Any ideas would be most helpful. This is a 1300 page catalog that customers cannot search.
Can you search those PDF documents from your engineering department? I.e. before inclusion in the book? Were those PDFs placed into InDesign? How were they generated?
If you can search those PDFs before book inclusion, it sounds like you may have a "refried PDF" encoding problem. That process - taking a PDF and re-PDF-ing it - can generate problems, and if Dov shows up in this thread he'd advise against it, I bet. I would wonder how the fonts were embedded by your engineers, and if those fonts are the same fonts that are being re-encoded in a non-searchable way. I'm guessing that you have some Identity-H in your engineers' files (open PDF in Acrobat -> Properties -> Fonts tab -> Encoding) that is colliding with the re-encoding of an identically-named font in the parts of your book that are not your refried PDFs.
That's just a guess, though. More details about the engineering PDFs will help us figure out what is going on with a bit more certainty.
Yes, I can search the PDF documents from the engineering department with no issue prior to inclusion into the book. I did place the PDF pages into ID and then they were generated as an interactive PDF.
In previous years, the PDFs generated by engineering were done via Crystal Reports. In reviewing the current files, they show:
PDF Producer: PrimoPDF
PDF Version: 1.3 (Acrobat 4.x)
The fonts show:
Arial (Embedded Subset), Type: TrueType, Encoding: Custom
Arial, Bold (Embedded Subset), Type: TrueType, Encoding: Custom
The previous year's files show:
PDF Producer: Powered by Crystal
PDF Version: 1.5 (Acrobat 6.x)
The fonts show:
Arial (Embedded Subset), Type: TrueType, Encoding: Built-in
Arial, Bold (Embedded Subset), Type: TrueType, Encoding: Built-in
Huh. Well, it seems like my hunch was incorrect. It might be an encoding collision, but I can't say. I wonder what it is about PrimoPDF output that doesn't work this time around. I'm just guessing here, you may have already tried all of this, but - what happens if you export as a print PDF? Is it then searchable? How about if you just place one PrimoPDF-generated PDF into ID, then just export that? Is that searchable? Because there are a variety of things changing here from your previous functional searchable PDFs:
CS4 -> CS6
Crystal Reports -> PrimoPDF
Print -> Interactive (because Interactive PDF was unavailable in CS4 IIRC)
and if I were in your shoes I'd be playing with those variables, trying to figure out which one, or which combination, was creating the problem. Unfortunately, you're dealing with circumstances I've not experienced, so I have no hard-and-fast answers for you.
I've tried many, many combinations and none of them work. I just tried creating a new ID document, placed a page from the PDF (PrimoPDF generated) and exported as both print and interactive. Neither allowed me to search. When I copied the text from the PDF into Notepad, I got small boxes in place of letters.
Performing the same test with a PDF file from Crystal in both print and interactive, the file is searchable and the text can be copied and pasted properly into Notepad.
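For anyone repeating this copy-and-paste test across many pages, a small script can flag pasted text that is mostly "boxes". This is only a sketch under an assumption: that the boxes are characters with no usable Unicode meaning (private-use, control, unassigned, or the U+FFFD replacement character). The function names and the 0.3 threshold are my own choices, not anything from Acrobat or InDesign.

```python
import unicodedata

# Unicode general categories that usually show up as boxes when pasted:
# Cc = control, Co = private use, Cn = unassigned, Cs = surrogate.
SUSPECT_CATEGORIES = {"Cc", "Co", "Cn", "Cs"}

def garbled_ratio(text):
    """Fraction of non-whitespace characters that look unreadable."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    bad = sum(
        1 for c in chars
        if c == "\ufffd" or unicodedata.category(c) in SUSPECT_CATEGORIES
    )
    return bad / len(chars)

def looks_garbled(text, threshold=0.3):
    """True if at least `threshold` of the characters look unreadable."""
    return garbled_ratio(text) >= threshold
```

Paste the copied text into a file, read it in, and call `looks_garbled(text)`. Text from the Crystal-generated pages should come back `False`, and text that pastes as boxes should come back `True` if the assumption above holds.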
According to the PrimoPDF manual, the PDFs need to be created using the Prepress profile in order for fonts to be embedded. That might be the issue with the fonts.
As for the searchability, no idea. Might be nice to have one of the offending PDFs. Even a dummy document created with Primo if the information is sensitive.
Did you try printing the PrimoPDF PDFs to PostScript and distilling them to PDF?
Maybe that will change the game…
You could do that from Acrobat Pro. Or place the PDFs in InDesign and print from there*
*To do that you will need the "ADPDF9.PPD" in a fresh new folder named "PPDs" in the "Presets" folder of your InDesign CS6 application folder (at least on Mac OS X).
First off, I wish to express my gratitude to all for your responses.
We've found the best solution to fix this issue is to purchase Acrobat for the engineers. We tested it and the problem was fixed.
You guys rock!