3 Replies Latest reply on Sep 19, 2017 3:28 AM by Test Screen Name

    Bug in Acrobat and Reader? Dash in multi-line URL removed when copying

    OsakaWebbie Level 1

      I design a magazine in InDesign and then export it to a PDF for use in the online archives. Until now I haven't bothered making URLs into actual hyperlinks, but one of the proofreaders decided to test a URL by copying it from the PDF into his browser, and he got a "page not found" error. I investigated and found that the URL was correct in the document but was being changed during the copy operation. These days, many URLs are quite long and have lots of words separated by dashes (because of the default way WordPress creates URLs from post titles), and sometimes they need to break across two lines on the page. If the break is at one of the dashes and a viewer of the PDF copies the URL in order to paste it into their browser, the text that ends up in the computer's clipboard (on both Windows and Mac) is missing that dash, apparently because Acrobat and Reader think it's a hyphenated word.

       

      Try it with this PDF: "Dropbox - Member Care without hyperlinks.pdf". Open it in Reader or Acrobat, highlight one of the URLs in the endnotes, copy, and paste into your browser. The dash in "the-different" or "baby-boomers" will go away, leaving you with a URL in your browser that doesn't work. Oddly, when I hover over the URL in the PDF, the little hint box shows it correctly, so I'm not sure where in the process the dash is getting removed.

       

      I know it fails in Acrobat Pro DC on Windows 10 (my machine), Reader DC on a different Windows 10 machine, and some version of Reader on a Mac. It works correctly in the PDF viewer of Google Chrome on my PC, so it appears to be a bug in the Adobe code, not something in the OS.
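
      To make the symptom concrete, here is a rough Python model of what the copy operation appears to be doing. This is only a guess at the behavior, not Adobe's actual code: the function name is hypothetical, and example.com stands in for the real domain.

```python
def join_lines_dehyphenating(lines):
    """Join wrapped lines, treating any trailing '-' as a word-break hyphen."""
    out = ""
    for line in lines:
        line = line.strip()
        if not out:
            out = line
        elif out.endswith("-"):
            # Assumed behavior: the end-of-line hyphen is dropped and the
            # two halves are glued together, as for a hyphenated word.
            out = out[:-1] + line
        else:
            out = out + " " + line
    return out


# A URL that wraps at one of its own dashes (placeholder domain).
wrapped = [
    "http://example.com/how-to-motivate-the-",
    "different-generations-in-the-workplace",
]
print(join_lines_dehyphenating(wrapped))
# prints http://example.com/how-to-motivate-thedifferent-generations-in-the-workplace
```

      The dash between "the" and "different" is gone, which is the same broken URL that ends up in the clipboard.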

        • 1. Re: Bug in Acrobat and Reader? Dash in multi-line URL removed when copying
          Test Screen Name Most Valuable Participant

          I don't think it's a bug. Rather, it is working as designed and deleting end-of-line hyphens. If you want it to do something different (either never delete them, or apply some fuzzy logic to guess when the copied text might be a URL), I suggest filing a feature request.
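
          As an illustration of what such "fuzzy logic" could look like, here is a minimal Python sketch. It is an assumption about one possible heuristic, not how Acrobat works: the function name, the regular expression, and the example.com URL are all made up for the example.

```python
import re

# Heuristic (an assumption): the text so far ends inside a URL if its last
# whitespace-free run contains "http://", "https://" or "www.".
IN_URL = re.compile(r"(?:https?://|www\.)\S*$", re.IGNORECASE)


def join_lines_url_aware(lines):
    """Join wrapped lines, keeping a trailing '-' when it sits inside a URL."""
    out = ""
    for line in lines:
        line = line.strip()
        if not out:
            out = line
        elif out.endswith("-"):
            if IN_URL.search(out):
                out = out + line        # inside a URL: the dash is real, keep it
            else:
                out = out[:-1] + line   # ordinary prose: drop the word-break hyphen
        else:
            out = out + " " + line
    return out


print(join_lines_url_aware(["see http://example.com/baby-", "boomers-at-work"]))
print(join_lines_url_aware(["this is hyphen-", "ated text"]))
```

          The first call keeps the dash in "baby-boomers" and the second still rejoins "hyphenated". A heuristic like this would still guess wrong for a URL that happens to wrap in the middle of a hyphenated word, which is part of why it would be a behavior change rather than a simple fix.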

          • 2. Re: Bug in Acrobat and Reader? Dash in multi-line URL removed when copying
            OsakaWebbie Level 1

            I wish I had the option to never do that. But when the string of "hyphenated" words is as long as "how-to-motivate-the-different-generations-in-the-workplace" (or another one: "japanese-translations-of-the-bible-characteristics-and-suggested-uses"), it's difficult to avoid. If I intentionally force InDesign to hyphenate a multi-syllable word instead of breaking at an existing dash, the copy/paste operation works, but readers of the printed magazine might think the hyphen is part of the URL. I need to prioritize the printed document.

             

            I assumed that the difference between a literal dash and a soft hyphen was somehow transmitted from InDesign to the PDF, but apparently that distinction is lost in the export, or the PDF format itself has no capacity to differentiate. [And I take back my statement that Chrome keeps the dash - I retested and see that it's also deleting it in some circumstances.] Why does it know enough to show it correctly when I hover? Also, if I tell Acrobat to make hyperlinks of all URLs in the document, it gets them right. I guess something inside there is doing the fuzzy logic you mentioned, but does not do so when copying. Oh well... I can mitigate the issue by making actual hyperlinks, which I hope will encourage the reader to Ctrl-click instead of highlight->copy->paste. Thanks for your reply.

            • 3. Re: Bug in Acrobat and Reader? Dash in multi-line URL removed when copying
              Test Screen Name Most Valuable Participant

              The PDF is just the characters you see, so all the niceties of styles and the type of hyphen are lost. As for what happens when you hover: well, that's different code, and to me the amazing thing is that it manages a multiline URL at all. Still, I agree it could be improved. I'm quite sure the text copying code is much older than the URL detecting code. Adobe may be loath to touch it because changing it means text will extract differently, and that incompatible change will cause other people pain. But a feature request will do no harm.

               

              A key problem is the natural wish to use PDF to distribute things other than a page you can see and print. That is something it's actually not very good at, but who wants to distribute two different things...
