I use PDF Creator (v1.7.0) myself, and it exhibits the problems I described. But really I was asking on behalf of a colleague who produces a PDF newsletter, which shows these two problems when I read it. I will find out what package he uses.
As for the long URL in your test PDF, that is fully live in my Reader! (Not that I clicked on it: I could tell from the pop-up when hovering the mouse over it.) So my problem (1) appears to be creator-dependent?
As for my problem (2), I should have mentioned that it applies to any text with an intended "hard" hyphen at the end of a line, not just to URLs: the copy function in my Reader treats it as a soft hyphen and drops it. So if you are creating a PDF, you simply need to avoid leaving a hard hyphen at the end of a line, particularly in a URL?
You are probably relying on the automatic hyperlinking in Adobe Reader. This is a very simple scan for things that look like URLs.
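To make the idea of "a very simple scan" concrete, here is a rough sketch of a scan that looks at one line of page text at a time. It is not Adobe's actual algorithm; the pattern, the function name scan_lines_for_urls and the example.com address are all made up for illustration. Under that assumption, a URL that wraps onto a second line is only detected up to the line break.

```python
import re

# Deliberately naive pattern (an assumption, not Reader's real rule):
# "http://", "https://" or "www." followed by any run of non-space characters.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")

def scan_lines_for_urls(page_lines):
    """Return (line_number, url_text) for each URL-looking run, one line at a time."""
    hits = []
    for number, line in enumerate(page_lines, start=1):
        for match in URL_PATTERN.finditer(line):
            hits.append((number, match.group()))
    return hits

# A URL that wraps onto a second line of the page.
page = [
    "See http://www.example.com/newsletters/2013/",
    "spring-issue.html for the full article.",
]
print(scan_lines_for_urls(page))
# [(1, 'http://www.example.com/newsletters/2013/')]
```

Only the first half of the address is picked up, because the second line on its own does not look like a URL. A real viewer may well be smarter than this, and behaviour can differ between versions and between the programs that produced the PDF.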
If you want more complex links (e.g. a link whose target is not identical to the text shown on a single line of the page), you have to add the links explicitly, perhaps in Acrobat.
The removal of hyphens is a normal part of copy-paste processing, and is not specific to URLs. It assumes that a hyphen at the end of a line is there to split a word, which isn't always the case, but you can't turn it off.
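As an illustration of that assumption, here is a minimal sketch of an "always remove" rule applied while joining lines for the clipboard. It is only a model of the behaviour described above, not Reader's real code; copy_text and the sample lines are invented for the example.

```python
def copy_text(page_lines):
    """Join page lines, treating every end-of-line hyphen as a word split."""
    pieces = []
    for line in page_lines:
        line = line.strip()
        if line.endswith("-"):
            # Assume the hyphen only marks a split word: drop it and
            # join directly onto the start of the next line.
            pieces.append(line[:-1])
        else:
            # Ordinary line end: separate from the next line with a space.
            pieces.append(line + " ")
    return "".join(pieces).strip()

print(copy_text(["This is a docu-", "ment about PDF."]))
# This is a document about PDF.
print(copy_text(["Use a well-", "known package."]))
# Use a wellknown package.
print(copy_text(["http://www.example.com/spring-", "issue.html"]))
# http://www.example.com/springissue.html
```

The first case is the one the rule is designed for. The second and third show the cost: a genuine hyphen that happens to fall at the end of a line is lost, which breaks a URL.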
Thank you, that helps to clear things up:
(1) A complex link (short text with an underlying full URL) is needed. This is easy to produce in the original Word document (Insert Hyperlink), but unfortunately the link is lost when I then create the PDF my way. I guess that Acrobat would retain it?
(2) End-of-line hyphen removal is unavoidable when copy-pasting in Reader (though it does not seem to be a feature of Word, for example).
1) If I add a hyperlink in Word and then make the PDF with the Acrobat add-in for Word (not print-to-PDF), I get a working, matching hyperlink.
2) Actually, Word does do this sometimes. When you copy and paste, Word removes an automatically inserted hyphen (one put there by automatic hyphenation), but not a hyphen that comes from splitting an already hyphenated word. The problem is that, once you have a PDF, either kind of hyphen is just a single, indistinguishable character (except in a tagged PDF). Adobe had to decide between always removing and always leaving it. I don't know if they made the right call, but there has probably been some analysis of the two kinds of end-of-line hyphen and their relative frequency.
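The difference can be sketched as follows. This is only a model: it assumes the editor marks an automatic hyphen with the Unicode soft hyphen U+00AD, which is a stand-in for however Word tracks it internally, and the function names are invented for the example.

```python
SOFT_HYPHEN = "\u00ad"  # assumed marker for an automatically inserted hyphen

def copy_from_editor(lines):
    """Copy text while the two kinds of hyphen can still be told apart."""
    out = []
    for line in lines:
        if line.endswith(SOFT_HYPHEN):
            out.append(line[:-1])   # automatic hyphen: drop it and join
        elif line.endswith("-"):
            out.append(line)        # typed hyphen: keep it and join
        else:
            out.append(line + " ")
    return "".join(out).strip()

def flatten_to_pdf(lines):
    """Model an untagged PDF: both kinds are drawn as the same visible hyphen."""
    return [line.replace(SOFT_HYPHEN, "-") for line in lines]

editor_lines = ["An auto" + SOFT_HYPHEN, "matic break and a well-", "known word."]

print(copy_from_editor(editor_lines))
# An automatic break and a well-known word.
print(flatten_to_pdf(editor_lines))
# ['An auto-', 'matic break and a well-', 'known word.']
```

After flattening, the first two lines both end in a plain "-", so anything copying from the page has nothing left to tell them apart and has to apply one rule to both.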