11 Replies Latest reply: Nov 30, 2012 10:00 PM by Jamil Jonna

    Turning off Hyphenation for URLs

    Jamil Jonna Community Member

      Hello,

       

      There was quite an old trick for turning off hyphenation of URLs, but it no longer seems to work. I used the following GREP find to style my URLs:

       

      ((http://www\.|www\.|http://)\w+(\.|/)(\w+(\.|/))*\w+)|[\l\u\d_%-]+@[\l\u\d_%-]+(\.|/)\w+(\.|/)\w*

       

      But when I apply the character style "[No Language]," the URL still hyphenates—as you can see in the attached screenshot. The URL in question is: http://thewallstreetjournal.org

       

      Does anyone know why this is?

       

      no-hyphenation.png
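
For readers who want to try the pattern outside InDesign, here is a minimal Python sketch of the same expression. It is an approximation, not the poster's actual setup: Python's `re` module does not have InDesign's `\l` and `\u` classes, so they are written out as `a-z` and `A-Z`, and the names `URL_OR_EMAIL` and `find_urls` are made up for illustration.

```python
import re

# The poster's GREP, with InDesign's \l\u\d classes spelled out as
# a-zA-Z0-9 because Python's re module does not support \l or \u.
URL_OR_EMAIL = re.compile(
    r"((http://www\.|www\.|http://)\w+(\.|/)(\w+(\.|/))*\w+)"
    r"|[a-zA-Z0-9_%-]+@[a-zA-Z0-9_%-]+(\.|/)\w+(\.|/)\w*"
)


def find_urls(text):
    """Return every substring the expression would style (hypothetical helper)."""
    return [m.group(0) for m in URL_OR_EMAIL.finditer(text)]


sample = "See http://thewallstreetjournal.org for details."
print(find_urls(sample))  # ['http://thewallstreetjournal.org']
```

This only shows which text the expression selects. It says nothing about how InDesign composes or hyphenates that text once a character style is applied.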

        • 1. Re: Turning off Hyphenation for URLs
          [Jongware] MVP

          You are probably stretching the possibilities too far.

           

          InDesign does not like hyphenating [No Language] text, because it doesn't know how to. On the other hand ... "thewallstreetjournal" is a single 'word', and you can see the effect of not hyphenating it by applying No Break. Since it's the 4th line of a very tight column, the spacing is going to be awful. InDesign assigned a value to "godawful spacing" <-> "inserting a random hyphen" and chose the Least Worst Solution.

          • 2. Re: Turning off Hyphenation for URLs
            Eugene Tyson MVP

            I tested this on a rather long url

             

            Premise is that a URL can't have spaces, so this looks for anything before the "www" but attached to it right up to the very end of the URL - given there are no spaces.

             

            (?<=\s).+?www.+?(?=\s)

             

            It probably won't work in all cases though
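
A rough way to see where this expression works and where it does not is to run it in Python, whose lookbehind and lookahead syntax is the same for this pattern. This is only a sketch: the sample sentences and the helper name `find_space_delimited_urls` are invented for the test, and InDesign's own engine may differ in details.

```python
import re

# The pattern from the post, unchanged: start right after a whitespace
# character, lazily run through "www", and stop right before the next
# whitespace character.
SPACE_DELIMITED_URL = re.compile(r"(?<=\s).+?www.+?(?=\s)")


def find_space_delimited_urls(text):
    """Return every match of the pattern in text (hypothetical helper)."""
    return [m.group(0) for m in SPACE_DELIMITED_URL.finditer(text)]


# A URL with whitespace on both sides and something attached before "www".
print(find_space_delimited_urls("Visit http://www.example.com/a/b today."))
# ['http://www.example.com/a/b']

# A URL at the very start of the text has no whitespace before it,
# so the lookbehind never succeeds.
print(find_space_delimited_urls("www.example.com starts the paragraph."))
# []
```

The second case is one example of the "won't work in all cases" caveat: the pattern needs a whitespace character on each side and at least one character in front of "www".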

            • 3. Re: Turning off Hyphenation for URLs
              Simon Dav Community Member

              You can use this GREP in paragraph styles:

               

              [\l\u\d\.]+@[\l\u\d\.]+

               

              The character style should have "No Break" set in its character formats; this will work.

               

               

              simon

              • 4. Re: Turning off Hyphenation for URLs
                Jamil Jonna Community Member

                I'm not sure why people are giving me GREP suggestions: mine works just fine, thank you. It is pointless to try and account for every situation where a long string might exist in a URL. It should be possible to style the whole URL, and still allow breaks at periods, after "http://," before/after "www," etc.

                 

                The problem is the inclusion of the hyphen. If someone were to type in the URL, they would think it actually had a hyphen (since some do) and that isn't acceptable. Further, I would like a break to occur (at the periods, or even between characters) but the break should not have a hyphen. As my link indicates above, the behavior did exist in the past.

                 

                @Jongware "InDesign does not like hyphenating [No Language] text, because it doesn't know how to." I could care less if InDesign "doesn't like" hyphenating No Language: the point is, it shouldn't since it has no basis to do so logically. I see no reason to rely on InDesign to handle every detail of my layout since I can track the minor cases in which long URLs might exist. The problem is that InDesign actually changes the location by adding an extra character. This is the reason why people created the trick to prevent hyphenation in the first place.

                • 5. Re: Turning off Hyphenation for URLs
                  Peter Spier ACP/MVPs

                  It is certainly a pain in the neck (and probably breaks the hyperlink if you expect it to auto-generate as in Reader, for example), but have you considered adding your own discretionary line breaks to URLs to control how they break?

                  • 6. Re: Turning off Hyphenation for URLs
                    Joel Cherney MVP

                    > @Jongware "InDesign does not like hyphenating [No Language] text, because it doesn't know how to." I could care less if InDesign "doesn't like" hyphenating No Language: the point is, it shouldn't since it has no basis to do so logically.

                     

                    I share your beef with InDesign's tendency to hyphenate stuff that it should not hyphenate (like things marked with No Language) but I think you are misreading Jongware's brevity. When he says "doesn't like" I think he means that there is a hierarchy of techniques for altering composition which the Adobe Paragraph Composer will use, and while hyphenation of No Language should be completely forbidden from our point of view, it is instead on the list of possible composition-altering tools that the Paragraph Composer will use. I don't think he'd disagree with you in your claim that the Paragraph Composer should not hyphenate No Language-marked text, simply that it does when you push it.

                     

                    In terms of resolving your problem: I created a No Break character style, and then applied it with a GREP style using your GREP. It prevented URLs from breaking at all. Then I sat down to adjust your GREP query to try to use it to add some discretionary line breaks, and then Peter posted his suggestion. I haven't been able to adjust your GREP query to add discretionary line breaks perfectly, but I think that this:

                     

                    Find: ([\l\u\d]+)(\.)([\l\u]+)

                    Replace with: $1$2~k$3

                     

                    will add discretionary line breaks to most of your URLs, and this GREP style

                     

                    Apply Style: No Break

                    To Text: [\l\u]+\.

                     

                    will permit hyphenation in non-URL content, yet prevent hyphenation of URLs. It worked perfectly in my test, but you will almost certainly need to fine-tune your initial GREP query to catch all of your potential %20-containing URL permutations. The only URL in my sample was actually your WSJ URL, and I see that your URL-finding GREP query accounts for far more permutations than the simple "letters separated by a period" that mine finds.
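
To see what the Find/Replace pair above does to a URL, here is a small Python sketch. It is a simulation under stated assumptions, not InDesign itself: `\l\u\d` are written out as `a-zA-Z0-9`, a visible `|` stands in for the discretionary line break (`~k`), and the names `add_breaks` and `add_breaks_every_dot` are hypothetical. The second function is a lookbehind variant added here for comparison; it is not from the thread.

```python
import re

BREAK = "|"  # visible stand-in for InDesign's discretionary line break (~k)

# The Find expression from the post, with \l\u\d spelled out for Python.
LABEL_DOT_LABEL = re.compile(r"([a-zA-Z0-9]+)(\.)([a-zA-Z]+)")

# Assumed alternative (not from the thread): a zero-width position that
# sits after "letter-or-digit + period" and before a letter.
EVERY_DOT = re.compile(r"(?<=[a-zA-Z0-9]\.)(?=[a-zA-Z])")


def add_breaks(text):
    """Python equivalent of 'Replace with: $1$2~k$3'."""
    return LABEL_DOT_LABEL.sub(
        lambda m: m.group(1) + m.group(2) + BREAK + m.group(3), text
    )


def add_breaks_every_dot(text):
    """Insert the break marker at every matching zero-width position."""
    return EVERY_DOT.sub(BREAK, text)


url = "www.thewallstreetjournal.org"
print(add_breaks(url))            # www.|thewallstreetjournal.org
print(add_breaks_every_dot(url))  # www.|thewallstreetjournal.|org
```

In Python, matches cannot overlap, so the first expression consumes "thewallstreetjournal" as the tail of one match and cannot reuse it as the head of the next. That may be one reason the original only covers "most" URLs in a single pass; whether InDesign's Change All behaves the same way is an assumption worth checking in a test document.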

                     

                    I don't know why your old method stopped working, to be honest. I suspect it is because of behind-the-scenes changes in the way the Paragraph Composer operates. I am so sick of improper hyphenation of stuff marked with No Language, and in general of Proximity algorithmic hyphenation, especially in non-English languages, that my general advice amounts to "Turn off all hyphenation everywhere unless it's absolutely necessary."

                    • 7. Re: Turning off Hyphenation for URLs
                      [Jongware] MVP

                      > When he says "doesn't like" I think he means that there is a hierarchy of techniques for altering composition which the Adobe Paragraph Composer will use, and while hyphenation of No Language should be completely forbidden from our point of view, it is instead on the list of possible composition-altering tools that the Paragraph Composer will use.

                       

                      I came to that statement by looking at the position of the URL inside the paragraph. Not breaking it would most likely stretch word spacing beyond a reasonable limit. Disagreeing with Adobe, however, on "what's reasonable", and what rules can be bent and what rules can be broken, is perfectly alright. There is only one catch: you can disagree all you want but it won't change a thing.

                       

                      If you think it would be better to break *anywhere*, if only there wouldn't be a hyphen, there is actually a code for that. Look for Discretionary Line Break.

                      • 8. Re: Turning off Hyphenation for URLs
                        Jamil Jonna Community Member

                        I understand what a discretionary line break is—thanks. My question from the beginning was simple: what is [No Language] supposed to do if, in the past, it had the predicted effect of avoiding hyphenation? In the end, the URL issue really isn't that important since I can manually deal with long URLs should they appear.

                         

                        I'm more interested in why this behavior changed. Joel seems to indicate an answer: the paragraph/word composer. I changed to the newer composer available in CS6, so I'll just try switching back. We'll see if [No Language] behaves as expected. There may be other instances where I need such behavior, which is why I'm interested.

                        • 9. Re: Turning off Hyphenation for URLs
                          Joel Cherney MVP

                          > I changed to the newer composer available in CS6 so I'll just try switching back. We'll see if the [No Language] behaves as expected. There may be other instances where I need such behavior, which is why I'm interested.

                           

                          I have been using the World-Ready Composer since its unadvertised, undocumented, under-the-hood introduction in CS4. I have noticed no differences whatsoever between Latin-script paragraph composition in the ordinary Paragraph Composer and the World-Ready Composer. So I don't think that the WRC is going to get you what you want, here.

                           

                          And, when thinking about it again, I don't know if the behavior of the Paragraph Composer has changed or not. Because the rules by which it composes paragraphs are not, to my knowledge, available to us as end users, we can't know for sure. So I tried some experiments: I made a one-page-sized text frame full of lorem ipsum. I set the language to English so it'd hyphenate. I then set the text "www.thewallstreetjournal.com" and applied No Language to it. I then sprinkled that URL throughout the sample text. I then duped this file and made separate files for CS3, CS4, CS5.5, and CS6. I then opened each one in its respective version and manually resized the text frame to make it very narrow. As you can see, I'm trying to recreate your issue, here.

                           

                          At the point at which I would have expected .thewallstreet. to hyphenate, I got... an "overset text" marker. I duplicated your environment as far as I could, but could not recreate your issue. In each version of InDesign, the No Language setting did exactly what you wanted.

                           

                          So, there are very few possibilities, here. But so far as I can tell, even on CS6 the "no language" setting still behaves as I would expect. I'm flabbergasted, honestly; I thought your issue would be easy to recreate. Can you maybe share the file in which this is happening? Or at least save out to .idml and reopen to see if this is being caused by corruption in your document?

                          • 10. Re: Turning off Hyphenation for URLs
                            Joel Cherney MVP

                            Eep. Never mind. </gildaradner>

                             

                            I just tried to recreate your example, even going so far as to OCR the text in your screenshot. I misread your last post - I thought that you were trying out the WRC to see if it'd fix your problem. I now see that using the WRC actually induced this issue, and yes, for sure you have found a bug in the WRC. It seems to be ignoring the "No Language" setting and hyphenating the URL when it shouldn't.

                             

                            Furthermore, it's a new bug in CS6 in the WRC - when I do the same test in CS5.5, all four composers (I didn't try either Japanese composer) respect the No Language setting and refuse to hyphenate the URL.

                            • 11. Re: Turning off Hyphenation for URLs
                              Jamil Jonna Community Member

                              Yay! So nice to finally be understood. I figured as much. Anyway, I hope they'll get it fixed in an update soon.

                               

                              Thanks for your help, Joel!