11 Replies. Latest reply: Oct 4, 2007 6:13 PM by meshgraphics

    RegExp multiline

    meshgraphics Community Member
      Can anyone make RegExp multiline work? So far I've been replacing \n and \r with <cr> to flatten the string, then switching the <cr> markers back to line breaks afterwards.
      ..
      var re = new RegExp("(.*)<!-- OrdinarySiteMap -->.*<!-- OrdinarySiteMap -->(.*)","m");

      RegExp.multiline = true;
      re.multiline = true;

      var Slurp = re.exec(ExistOSMfile);

      if (Slurp) {
      var Topper = Slurp[1] + '<!-- OrdinarySiteMap -->';
      var Bottomer = '<!-- OrdinarySiteMap -->' + Slurp[2];
      ...
        • 1. Re: RegExp multiline
          Newsgroup_User Community Member
          meshgraphics,

          Most people assume that the . wildcard means "match everything", but it
          actually means "match everything except \r and \n". Use (.\r\n) instead.

          Another tip: most people also assume that * means match *first* (i.e.
          smallest) match, but it actually means match *largest* match. This can
          lead to undesired results and unnecessary performance penalty
          (especially when searching across multiple lines). Use .*? or (.\r\n)*?
          for minimum match.
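A minimal sketch of both points, assuming a made-up two-line sample string. The alternation that lets the match cross line breaks is written here as (.|\r|\n); the variable names are only illustrative.

```javascript
// Hypothetical sample: two tags on separate lines.
var text = "<b>one</b>\n<b>two</b>";

// "." stops at the line break, so the match stays on the first line.
var dotOnly = /<b>.*<\/b>/.exec(text);
// dotOnly[0] is "<b>one</b>"

// Greedy "*" with line breaks allowed runs to the LAST closing tag.
var greedy = /<b>(.|\r|\n)*<\/b>/.exec(text);
// greedy[0] is the whole string

// Lazy "*?" with line breaks allowed stops at the FIRST closing tag.
var lazy = /<b>(.|\r|\n)*?<\/b>/.exec(text);
// lazy[0] is "<b>one</b>"
```

The greedy form has to scan to the end of the input and then back up, which is where the extra cost on long multi-line strings comes from.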

          HTH,
          Randy


          • 2. Re: RegExp multiline
            meshgraphics Community Member
            Ok Randy. Do most people know how multiline RegExp works? I don't see any evidence that you do.
            • 3. Re: RegExp multiline
              Newsgroup_User Community Member
              > Ok Randy.

              Ok, mr. anonymous.

              > Do most people know how multiline RegExp works?

              No. I've answered this question many times.

              > I don't see any evidence that you do.

              You didn't even try what I said. Excuse me for answering off the top of
              my head and getting the syntax slightly wrong, it should be this:

              (.|\n|\r)*?
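Applied to the code from the first post, this could look roughly like the sketch below. The file contents are a made-up stand-in for ExistOSMfile, and the non-capturing (?:...) form is used only to keep the group numbers at 1 and 2; treat it as one possible arrangement rather than a tested DW extension.

```javascript
// Hypothetical stand-in for the file contents read into ExistOSMfile.
var ExistOSMfile =
  "<html>\r\n<body>\r\n" +
  "<!-- OrdinarySiteMap -->\r\nold links\r\n<!-- OrdinarySiteMap -->\r\n" +
  "</body>\r\n</html>";

var marker = "<!-- OrdinarySiteMap -->";

// Original pattern: "." cannot cross the line breaks, so exec() returns null.
var original = new RegExp("(.*)" + marker + ".*" + marker + "(.*)", "m");
var originalResult = original.exec(ExistOSMfile);

// Same shape with each ".*" replaced by a lazy (.|\n|\r)*? group.
// The ^ and $ anchors (no "m" flag) pin the match to the whole string,
// so the last lazy group is forced to run to the end.
var any = "(?:.|\\n|\\r)*?";
var fixed = new RegExp("^(" + any + ")" + marker + any + marker + "(" + any + ")$");

var Slurp = fixed.exec(ExistOSMfile);
var Topper = Slurp ? Slurp[1] + marker : null;
var Bottomer = Slurp ? marker + Slurp[2] : null;
```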
              • 4. Re: RegExp multiline
                Newsgroup_User Community Member
                What I think is missing in DW is a "multiline" checkbox on the Search &
                Replace GUI. With this checked DW should let "." match newlines as well,
                just like the "\m" modifier does.
                After all, we have a "Match case" checkbox that works like "\i". Since DW
                doesn't allow modifiers inside the RegExp itself, all the available
                modifiers should be exposed by the GUI.

                As an alternative DW should understand modifiers that are entered inside the
                GUI (either \i, \m or whatever the engine supports).

                As it stands, it's not obvious at all for anybody who is familiar with
                RegExp. Using (.|\n|\r)*? is just a counterintuitive workaround.

                I asked for this feature many times in the past. But I gave up a few years
                ago...


                --
                ----------------------------
                Massimo Foti, web-programmer for hire
                Tools for ColdFusion and Dreamweaver developers:
                http://www.massimocorner.com
                ----------------------------




                • 5. Re: RegExp multiline
                  Newsgroup_User Community Member
                  Hi Massimo,

                  Good to see you here!

                  It looked like the original poster was having problems in the code, so I
                  didn't even mention the UI :)

                  DW always does a multi-line search in the Search & Replace now (in CS3,
                  anyway -- can't remember which version we changed that). Of course, it
                  doesn't work if you use . for the reasons that I have already explained.

                  So, I'll mark your feature request as "fixed" :)

                  Randy


                  • 6. Re: RegExp multiline
                    Newsgroup_User Community Member
                    "Randy Edmunds" <redmunds_nospam@adobe.com> wrote in message
                    news:fdu9lj$dor$1@forums.macromedia.com...
                    > It looked like the original poster was having problems in the code, so I
                    > didn't even mention the UI :)
                    >
                    > DW always does a multi-line search in the Search & Replace now (in CS3,
                    > anyway -- can't remember which version we changed that). Of course, it
                    > doesn't work if you use . for the reasons that I have already explained.
                    >
                    > So, I'll mark your feature request as "fixed" :)

                    Sorry Randy, but I don't think I explained my request well enough.

                    Modern RegExp engines support modifiers:
                    http://www.regular-expressions.info/modifiers.html

                    DW doesn't support modifiers inside the pattern. Most likely this choice
                    was made to make life easier for people who don't know RegExp too well (most
                    users out there). Instead there is a checkbox that mimics the way the /i
                    modifier works.

                    What's missing is something that lets you switch between /s and /m.

                    You either provide GUI equivalents for all the modifiers supported by the
                    RegExp engines or let advanced users add modifiers inside the pattern
                    itself.

                    So yes, DW always does a multi-line search, but I want a way to turn it into
                    "single-line mode", where "dot" matches new lines too. See below (please
                    note it's very easy to mix up the terms):
                    http://www.regular-expressions.info/dot.html
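To make the difference between the two terms concrete, here is a small sketch in plain JavaScript, assuming a made-up two-line string. It only shows what the "m" flag changes and what it leaves alone; the variable names are illustrative.

```javascript
var text = "first line\nsecond line";

// Without "m", ^ matches only at the very start of the string.
var noFlag = /^second/.test(text);                       // false

// With "m", ^ also matches right after a line break.
var withFlag = /^second/m.test(text);                    // true

// "m" does nothing for ".": it still refuses to cross the line break.
var dotWithFlag = /first.*second/m.test(text);           // false

// "Single-line" behavior has to be spelled out in the pattern itself.
var spelledOut = /first(.|\n|\r)*second/.test(text);     // true
```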

                    Massimo



                    • 7. Re: RegExp multiline
                      meshgraphics Community Member
                      Attn: Tourists

                      This newsgroup is for people who have actually written a DW extension, not feature requests for newbies.

                      The DW extension environment is a subset of DOM level 1 and the Microsoft Internet Explorer 4.0 DOM. This is far surpassed by modern web browsers, so testing your JS in a web page doesn't help you with the finer points.
                      http://livedocs.adobe.com/en_US/Dreamweaver/9.0_Extending/index.html

                      The RegExp features listed at regular-expressions.info do not matter when you are trying to troubleshoot your DW extension. What matters is: what actually works in the DW extension environment.

                      The purpose of the original questions and code snippet was to grab the top portion of a file, up to the marker, in its original format, so that it could be written as is, to the output file. According to old js, something like this would work:

                      Slurp = /(^.*)MARKER(.*?)/m.exec(MultilineInputString);
                      write(Slurp[1]); // <html><head><body>
                      write(Slurp[2]); // </body></html>

                      If you can make this work in a DW extension, let me know. Otherwise, enjoy the Britney Spears news.

                      Jon Wojkowski
                      meshgraphics.com

                      • 8. Re: RegExp multiline
                        Newsgroup_User Community Member
                        meshgraphics wrote:
                        > Attn: Tourists
                        >
                        > This newsgroup is for people who have actually written a DW extension, not
                        > feature requests for newbies.

                        Massimo has written quite a few extensions, and Randy is one of the engineers that works on Dreamweaver, so it might be worth backing off a bit with the attitude if you intend to get help from folks.

                        > The RegExp features listed at regular-expressions.info do not matter when you are
                        > trying to troubleshoot your DW extension. What matters is: what actually works
                        > in the DW extension environment.

                        They do to a certain extent, as they refer to the use of regular expressions within JavaScript, which Dreamweaver does use in extensions. Note this page:
                        http://www.regular-expressions.info/javascript.html

                        It states:
                        /m enables "multi-line mode". In this mode, the caret and dollar match before and after newlines in the subject string.

                        It also states:
                        Notably absent is an option to make the dot match line break characters.

                        Based upon both of these statements, the RegExp you present below does not do what you want it to do in JavaScript, regardless of Dreamweaver's involvement. The m flag does not allow the . character to match newlines in JavaScript, so you won't get everything from the beginning of the input string; rather, you'll get everything from the beginning of the line that MARKER happens to be on.
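A quick way to see this outside of Dreamweaver is the sketch below, run against two made-up inputs: one with MARKER on its own line and one with MARKER in the middle of a line. The input strings are assumptions for illustration only.

```javascript
// MARKER on its own line.
var ownLine =
  "<html>\n<head>\n<title>My Document</title>\n</head>\n<body>\n" +
  "MARKER\n</body>\n</html>";
var Slurp = /(^.*)MARKER(.*?)/m.exec(ownLine);
// Slurp[1] is "" (nothing precedes MARKER on its line)
// Slurp[2] is "" (the lazy group is happy to match nothing)

// MARKER in the middle of a line.
var midLine = "<html>\n<body>MARKER</body>\n</html>";
var Slurp2 = /(^.*)MARKER(.*?)/m.exec(midLine);
// Slurp2[1] is "<body>" (only the start of that one line)
// Slurp2[2] is ""
```

In neither case does the first group reach back to the start of the input, and the trailing (.*?) always comes back empty because nothing after it forces it to grow.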

                        > The purpose of the original questions and code snippet was to grab the top
                        > portion of a file, up to the marker, in its original format, so that it could
                        > be written as is, to the output file. According to old js, something like this
                        > would work:
                        >
                        > Slurp = /(^.*)MARKER(.*?)/m.exec(MultilineInputString);
                        > write(Slurp[1]); // <html><head><body>
                        > write(Slurp[2]); // </body></html>

                        What "old js" are you referring to that says that this should work the way you are saying it should?

                        For this code:
                        <html>
                        <head>
                        <title>My Document</title>
                        </head>
                        <body>
                        MARKER
                        </body>
                        </html>

                        This RegExp grabs the content from the beginning of the code before MARKER, and also the content after MARKER through the end of the code:
                        /^((?:.|\r|\n)*)MARKER((?:.|\r|\n)*)$/
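As a rough check, the sketch below runs that RegExp against the sample code above, written as a string with \n line breaks (the line-break style is an assumption; \r\n input works the same way because the pattern allows both characters).

```javascript
var code =
  "<html>\n<head>\n<title>My Document</title>\n</head>\n<body>\n" +
  "MARKER\n</body>\n</html>";

var parts = /^((?:.|\r|\n)*)MARKER((?:.|\r|\n)*)$/.exec(code);

var before = parts[1];  // everything up to MARKER, line breaks included
var after = parts[2];   // everything following MARKER

// Gluing the pieces back together reproduces the input exactly.
var rebuilt = before + "MARKER" + after;
```

Note that there is no m flag here, so ^ and $ mean the start and end of the whole string rather than of a line.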




                        --
                        Danilo Celic
                        | Extending Knowledge Daily : http://CommunityMX.com/
                        | Adobe Community Expert
                        • 9. Re: RegExp multiline
                          Newsgroup_User Community Member
                          > So yes, DW always does a multi-line search, but I want a way to turn it into
                          > "single-line mode", where "dot" matches new lines too.

                          Ah. As you know, DW uses the Netscape JavaScript engine. We have the
                          JavaScript source code and have made tweaks in the past, but they end up
                          being a maintenance nightmare in the long run, so hopefully this will
                          get fixed in JavaScript.

                          > See below (please note it's very easy to mix up the terms):
                          > http://www.regular-expressions.info/dot.html

                          Yes, a slightly easier and clearer technique to match everything in
                          JavaScript regular expressions is to use two opposing terms such as
                          [\s\S]. This matches all characters which are either whitespace or not
                          whitespace.
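A small sketch of the difference, assuming a handful of made-up one-character samples: each is tested against "." and against [\s\S], and the ones that fail are collected.

```javascript
var samples = ["a", " ", "\t", "\n", "\r", "<"];
var dotMisses = [];
var classMisses = [];

for (var i = 0; i < samples.length; i++) {
  // Does the single character match "." ?
  if (!/^.$/.test(samples[i])) { dotMisses.push(samples[i]); }
  // Does it match "whitespace or not whitespace" ?
  if (!/^[\s\S]$/.test(samples[i])) { classMisses.push(samples[i]); }
}
// dotMisses ends up holding "\n" and "\r"; classMisses stays empty.
```

Other opposing pairs such as [\d\D] or [\w\W] behave the same way; [\s\S] is just the one mentioned here.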

                          Thanks,

                          Randy Edmunds
                          Dreamweaver Engineering
                          Adobe Systems, Inc.
                          • 10. Re: RegExp multiline
                            Newsgroup_User Community Member
                            danilocelic AdobeCommunityExpert wrote:
                            > This RegExp grabs the content from the beginning of the code before
                            > MARKER, and also the content after MARKER through the end of the code:
                            > /^((?:.|\r|\n)*)MARKER((?:.|\r|\n)*)$/

                            And based on Randy's latest post this Regular Expression also works:
                            /^([\s\S]*)MARKER([\s\S]*)$/

                            Goes to show there are many ways to handle doing regular expressions, and you can always learn a new way to do the same thing.


                            --
                            Danilo Celic
                            | Extending Knowledge Daily : http://CommunityMX.com/
                            | Adobe Community Expert
                            • 11. Re: RegExp multiline
                              meshgraphics Community Member
                              Thanks to Randy and Danilo for their attention to this topic.

                              I've used this regexp with success

                              <snip>
                              tempfile = (/^((?:.|\r|\n)*)OrdinarySiteMap(.*) OrdinarySiteMap((?:.|\r|\n)*)$/.exec(ExistOSMfile));

                              if (! DWfile.write(localSiteURL+ "temp.html", tempfile[1]+
                              "xxxxxxxxxxxxxxxxxxmiddlexxxxxxxxxxxx" +tempfile[3] )){
                              alert("Write Failed: " + localSiteURL + "temp.html");
                              }
                              </snip>
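One thing the snippet above does not cover is the case where the markers are missing: exec() then returns null and tempfile[1] throws. A hedged sketch of a guard follows. The helper name splitAroundMarkers, its return shape, and the sample marker are all hypothetical, and the middle section uses [\s\S]*? so that it may span several lines.

```javascript
// Hypothetical helper: returns null when the two markers are not found.
function splitAroundMarkers(text, marker) {
  // Escape regex metacharacters so the marker is matched literally.
  var literal = marker.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var re = new RegExp(
    "^([\\s\\S]*?)" + literal + "([\\s\\S]*?)" + literal + "([\\s\\S]*)$"
  );
  var m = re.exec(text);
  if (!m) { return null; }
  return { top: m[1], middle: m[2], bottom: m[3] };
}

// Made-up input with a multi-line middle section.
var parts = splitAroundMarkers(
  "top\n<!-- X -->\nold\nlines\n<!-- X -->\nbottom",
  "<!-- X -->"
);
// Only build (and, in DW, write) the output when the split succeeded.
var output = parts ? parts.top + "new middle" + parts.bottom : null;

var missing = splitAroundMarkers("no markers here", "<!-- X -->");
```

In an extension, the DWfile.write call would go inside the branch where parts is not null.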

                              The (?:) extension is not in Javascript 1.3 (c)1999 from the former devedge.netscape.com which is archived in one form or another at
                              http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Guide:Creating_a_Regular_Expression

                              http://perldoc.perl.org/perlre.html (which is pretty old) says things like:
                              ...
                              This is for clustering, not capturing; it groups subexpressions like "()", but doesn't make backreferences as "()" does. So...
                              The stability of these extensions varies widely...
                              ...
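What "clustering, not capturing" means in practice can be seen in a short sketch, assuming an engine that accepts (?:...) at all; the sample pattern and input are made up.

```javascript
// Plain parentheses: the group is numbered and stored in the result.
var capturing = /(a|b)+(c)/.exec("abc");
// capturing is ["abc", "b", "c"], so "c" is at index 2

// (?:...) groups the alternation but takes no slot in the result.
var clustering = /(?:a|b)+(c)/.exec("abc");
// clustering is ["abc", "c"], so "c" is at index 1
```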

                              I've never checked out Microsoft perversion on it but there is something here:
                              http://msdn2.microsoft.com/en-us/library/yab2dx62.aspx

                              Peace out,
                              Jonny