Are you sure the problem is related to tagging? If you create a PDF without tags does the problem go away? If so, what version of Acrobat are you using on what platform? I have encountered this problem with legacy PDFs created by others, so I don’t know what caused it. My work-around has been to use the alt text property to correct the text. Link spelling should also be corrected in the <Link> structure element. Hope this helps.
a ‘C’ student
Hi, thanks a lot for your reply. Yes, the problem is indeed related to tagging. I am using Adobe Acrobat 11 on Windows 7. After some tests, I was able to understand what causes the issue. It happens when you add text in the "Actual Text" field in the tag properties. Here is a simple way to reproduce the issue. Open a Word document and write two lines of text separated by some hard returns, for example you can write:
First line of text
Second line of text
Then convert the file to PDF using Adobe Acrobat, which by default does not add tags. After the file is converted, open it and add the tags using the "Add Tags to Document" option in the "Accessibility" menu. Now go to the tag tree, open the <P> tag that contains the text, go to its properties, and add some text in the "Actual Text" field. This causes the text selection issue. Here is a sample PDF file that I created following these steps.
Good to hear you solved the problem!
Well, I was only able to understand what causes the issue. However, avoiding the "Actual Text" property because it can cause text selection problems is not a real solution, as you may need this feature in your documents. I hope that Adobe will fix this problem in a future update.
Surely this PDF is wrong. ActualText is for use when normal text extraction techniques will not produce the same results that would be perceived by a person with vision. In this case the ActualText, "First line of text", is different from what a person with vision would see: "First line of text Second line of text". So you are not using the tags correctly.
It does raise an interesting question, since ActualText should be controlling text extraction, and probably selection too. All a PDF viewer has is a piece of extractable text (the ActualText) and a collection of words on the page which the ActualText replaces. Since it is impossible to work out where on the page the individual characters of the ActualText belong, it is impossible to highlight them individually. Obviously the tag could be ignored for highlighting, but then it is impossible to discover which parts of the ActualText were selected, so copy will fail. I suspect ActualText should be used at the smallest possible elemental level, perhaps for each single character that cannot be represented.
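To make the argument above concrete, here is a minimal toy model in Python. It is only a sketch of the reasoning, not how Acrobat or any real viewer is implemented; the names `Glyph`, `TaggedElement`, `make_element` and `extract_selection` are all hypothetical and invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Glyph:
    char: str   # character drawn on the page
    x: float    # hypothetical horizontal position of the glyph


@dataclass
class TaggedElement:
    glyphs: List[Glyph]                # what is actually painted on the page
    actual_text: Optional[str] = None  # replacement text for the whole element


def make_element(text: str, actual_text: Optional[str] = None) -> TaggedElement:
    # Place each character one unit apart (an assumption for the toy model).
    glyphs = [Glyph(c, float(i)) for i, c in enumerate(text)]
    return TaggedElement(glyphs, actual_text)


def extract_selection(element: TaggedElement, start: int, end: int) -> str:
    """Return the text copied when glyphs [start, end) are selected."""
    if start >= end:
        return ""
    if element.actual_text is None:
        # Without ActualText, each glyph maps to exactly one character.
        return "".join(g.char for g in element.glyphs[start:end])
    # With ActualText there is no character-to-glyph mapping, so the toy
    # viewer can only treat the element as all-or-nothing.
    return element.actual_text


plain = make_element("First line of text")
tagged = make_element("First line of text Second line of text",
                      actual_text="First line of text")

print(extract_selection(plain, 0, 5))   # partial selection works
print(extract_selection(tagged, 0, 5))  # partial selection is not possible
```

In this model, the larger the element that carries the replacement text, the coarser the selection becomes, which is the motivation for attaching ActualText to the smallest possible unit.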
Thank you very much for your reply. If I got it right, you are saying that the tag was not used correctly. So, are you saying that in order to use the tag correctly, you should put "First line of text Second line of text" in the "actual text" property of the <P> tag containing the text?
Because if this is the case, the problem is that any text you put in the "actual text" field will cause the text selection issue. So, even if you use the tag correctly, the issue will still be here.
You can copy the text correctly if you use the "copy with formatting" option in Adobe Acrobat though.
Now, it is interesting that this problem only happens with Adobe Reader: if you open the file with any other PDF viewer (such as Foxit, Sumatra, Nitro PDF, and all the others I tested), there are no text selection issues. All text is perfectly selectable and searchable.
I don't know about those viewers, but I imagine many of them ignore tags, so they won't be affected by the incorrect use of tags...
Have you read the specification of how that tag is supposed to be used? Unless I hear otherwise, I'd repeat that the correct use is to replace the smallest possible unit of text: preferably a single character. Certainly never whole sentences. What are you trying to achieve with it?
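For reference, here is a rough sketch of what "smallest possible unit" can look like at the content-stream level. The PDF specification's section on replacement text describes wrapping just the affected glyphs in a `/Span` marked-content sequence with an `ActualText` entry. The Python helpers below (`pdf_literal` and `span_with_actual_text`, both hypothetical names) only build that fragment as a string. They assume simple text that can be written as a PDF literal string, and they ignore font encodings, so treat the output as an illustration rather than something to paste into a real file.

```python
def pdf_literal(text: str) -> str:
    """Write text as a PDF literal string, escaping backslashes and parentheses."""
    escaped = (text.replace("\\", "\\\\")
                   .replace("(", "\\(")
                   .replace(")", "\\)"))
    return f"({escaped})"


def span_with_actual_text(shown: str, actual: str) -> str:
    """Wrap the shown glyphs in a /Span marked-content sequence with ActualText."""
    return (f"/Span <</ActualText {pdf_literal(actual)}>> BDC\n"
            f"{pdf_literal(shown)} Tj\n"
            "EMC")


# Only the glyphs "k-" are replaced (by "c"); the rest of the word stays
# ordinary, individually selectable text outside the span.
print(span_with_actual_text("k-", "c"))
```

The point of the sketch is that the replacement covers two glyphs, not a whole paragraph, so a viewer that treats the span as one unit loses very little selection granularity.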
Thanks for your help. Well, the problem I am having is that whether or not I use the tag correctly, I still have this text selection issue in my documents, even when using the tag to replace a single character. I don't think the issue is related to the correct use of the tag itself; it's probably a bug, and so I hope it will be fixed in a future update.
Recreated the problem. Bizarre. I’m with 3MXO - Adobe please fix this bug. While you are at it could I please have a way to set internal links like TOC entries to “Inherit Zoom” by default?
a 'C' student