8 Replies Latest reply on Dec 28, 2016 9:16 AM by BKBK

    Any regExp expert here?

    ghfftttyyudsderycv76 Level 1

      How would you handle one or more hyphens in words? For instance, "T-cell" is a word,

      <cfset w = "T-cell">,

      REreplace("a long text string T-cell and more...", "#w#", "<a href=''>#w#</a>")

      The above REreplace call failed to identify "T-cell" as one word. According to the Adobe documentation, a hyphen indicates a range, as in [a-z], so to reference it as a literal one needs to add the hyphen at the end. I attempted the following:

      REreplace("a long text string T-cell and more...", "#w#-", "<a href=''>#w#</a>")  or

      REreplace("a long text string T-cell and more...", "[#w#-]", "<a href=''>#w#</a>")

      However, it misfired and caused serious problems. What's the correct regExp for this situation?

      Many thanks

        • 1. Re: Any regExp expert here?
          cp_anil@rediff Level 1

          Sorry if I have misunderstood your question, but if your intention is to change the word "T-cell" into "<a href=''>T-cell</a>" in the given example, the first option you gave will work fine as it is.

          I could not find anything wrong with it, because a hyphen causes an issue only when it is inside square brackets.

           

          So the piece of code

           

          REreplace("a long text string T-cell and more...", "#w#", "<a href=''>#w#</a>")

          will return

           

          a long text string <a href=''>T-cell</a> and more... 

           

          Please correct me if I have missed anything in your question.

          • 2. Re: Any regExp expert here?
            BKBK Adobe Community Professional & MVP

            ghfftttyyudsderycv76 wrote:

             

            How would you handle one or more hyphens in words? For instance, "T-cell" is a word

             

            That implies that you wish to replace T-cell with T*cell, where the * stands for a character other than - or no character at all. But then it seems from the rest of the text that that is not the question you wish to ask.

             

             

            REreplace("a long text string T-cell and more...", "#w#", "<a href=''>#w#</a>")

             

            As cp_anil@rediff correctly says, this suggests that you wish to replace T-cell with <a href=''>T-cell</a>. If that is the case, then you could avoid regular expressions altogether and use the simpler,

             

            replaceNoCase("a long text string T-cell and more...", w, "<a href=''>#w#</a>", "all")

            • 3. Re: Any regExp expert here?
              ghfftttyyudsderycv76 Level 1

              Well, I've solved the problem.

               

              But to appreciate your response and for the benefit of others...

              I initially simplified the problem statement. That is, I first read a simple HTML file into a var, then ran

              REreplace("#varName4longtext.", "#w#", "<a href=''>#w#</a>")

              When var w contains "-", the above REreplace statement failed to identify the value of w in its entirety, such as "T-cell"; thus the question.

               

              This is how I solved the problem after posting the question: I first replaced all occurrences of "-" with something else, such as DASH, then ran REreplace (replace would probably suffice as well...), and once done, reverted DASH back to the original "-".
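
              A minimal sketch of the placeholder workaround described above. The sample text, the variable names, and the DASH token are assumptions for illustration; the token must not already occur in the text or in w, otherwise the last step would turn it into a hyphen.

              ```cfml
              <!--- Sample values; in the real case the text is read from the HTML file --->
              <cfset w = "T-cell">
              <cfset longText = "a long text string T-cell and more...">

              <!--- 1. Swap every hyphen for the placeholder, in the text and in the search term --->
              <cfset placeholder = "DASH">
              <cfset safeText = replace(longText, "-", placeholder, "all")>
              <cfset safeW = replace(w, "-", placeholder, "all")>

              <!--- 2. Run the replacement on the hyphen-free strings --->
              <cfset linked = REReplace(safeText, safeW, "<a href=''>#safeW#</a>", "all")>

              <!--- 3. Restore the hyphens --->
              <cfset newText = replace(linked, placeholder, "-", "all")>

              <cfoutput>#newText#</cfoutput>
              ```

              This only sidesteps the hyphen. Any other regex metacharacter in w (such as . or +) would still be interpreted as a metacharacter in step 2.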

              • 4. Re: Any regExp expert here?
                BKBK Adobe Community Professional & MVP

                Thanks for sharing that. You were having trouble because the - is a special character in regex. You should therefore have escaped it with \.

                 

                Here is a suggestion:

                 

                <cfset w = "T-cell">

                <cfset regex_w = "T\-cell">

                ...

                <cfset newText = REreplaceNoCase(varName4longtext, regex_w, "<a href=''>#w#</a>", "all")>

                • 5. Re: Any regExp expert here?
                  cp_anil@rediff Level 1

                  In regex, a hyphen is treated as a special character only when it is part of a character class (that is, inside square brackets).

                  Even inside square brackets, there is no issue when it is at the beginning or end of the list of characters.

                   

                  E.g. [-abc] and [abc-]: in both of these the hyphen won't be treated as a special character.
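
                  To illustrate the point about hyphen position, here is a small sketch (the sample string and variable names are made up for this example). It also suggests why a pattern such as [T-cell-] misfires: inside the brackets, T-c is read as a range of characters, and the class matches one character at a time, never the whole word.

                  ```cfml
                  <cfset sample = "a-b c-d">

                  <!--- Hyphen between two characters: the range a..c; hyphens are left alone --->
                  <cfset r1 = REReplace(sample, "[a-c]", "*", "all")>   <!--- *-* *-d --->

                  <!--- Hyphen last or first: a literal hyphen, so a, b, c and - are all replaced --->
                  <cfset r2 = REReplace(sample, "[abc-]", "*", "all")>  <!--- *** **d --->
                  <cfset r3 = REReplace(sample, "[-abc]", "*", "all")>  <!--- *** **d --->

                  <!--- [T-cell-] is a class: the range T..c plus e, l and a literal -; it matches single characters --->
                  <cfset r4 = REReplace("T-cell", "[T-cell-]", "*", "all")>  <!--- ****** --->
                  ```

                  Because the range T..c includes the lowercase letters a, b and c, such a class can also match characters well outside the intended word, for example the very first "a" of a sentence.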

                   

                  Also, no matter whether you are using the long text directly or in a variable, there shouldn't be any difference in the behaviour of the ReReplace function.

                  I suspect something else caused the error in your first trial with the ReReplace function.

                  • 6. Re: Any regExp expert here?
                    BKBK Adobe Community Professional & MVP

                    cp_anil@rediff wrote:

                     

                    In regex, a hyphen is treated as a special character only when it is part of a character class (that is, inside square brackets).

                    Try this, and you will see that it works when you escape the - character:

                     

                    <cfset regex_w = "T\-cell">

                    <cfset text = "Yes, T-cell is a match">

                    <cfset newText = REreplaceNoCase(text, regex_w, "the escaped regex")>

                    <cfoutput>#newtext#</cfoutput>

                     

                     

                    Also, no matter whether you are using the long text directly or in a variable, there shouldn't be any difference in the behaviour of the ReReplace function.

                     

                    There is a difference in behaviour. ReReplace will match T-cell and T-Cell differently.

                    • 7. Re: Any regExp expert here?
                      ghfftttyyudsderycv76 Level 1

                      No. The suggestion of <cfset regex_w = "T\-cell"> does not make sense because, as I mentioned, var w could be "T-cell" or anything else that contains one or more "-"; thus, a manual escape won't work.

                       

                      In addition, as mentioned in my follow-up post, I've solved the problem, albeit without using a regexp technique.
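
                      For a search term held in a variable, two approaches seem possible; the following is a sketch under stated assumptions, not a definitive fix. Option 1 avoids regular expressions entirely by using replace, so nothing in w needs escaping. Option 2 builds an escaped pattern by prefixing every non-alphanumeric character of w with a backslash, on the assumption that the regex engine treats a backslash-escaped punctuation character as a literal (as Perl-style engines do). The names sampleText, regex_w, newText1 and newText2 and the sample values are illustrative.

                      ```cfml
                      <cfset w = "T-cell">
                      <cfset sampleText = "a long text string T-cell and more...">

                      <!--- Option 1: plain string replacement, no regex and no escaping --->
                      <cfset newText1 = replace(sampleText, w, "<a href=''>#w#</a>", "all")>

                      <!--- Option 2: escape w character by character, then use it as a regex --->
                      <cfset regex_w = "">
                      <cfloop index="i" from="1" to="#len(w)#">
                          <cfset ch = mid(w, i, 1)>
                          <cfif REFind("[A-Za-z0-9]", ch)>
                              <!--- Letters and digits are copied unchanged --->
                              <cfset regex_w = regex_w & ch>
                          <cfelse>
                              <!--- In CFML a backslash inside a string literal is an ordinary character --->
                              <cfset regex_w = regex_w & "\" & ch>
                          </cfif>
                      </cfloop>
                      <cfset newText2 = REReplace(sampleText, regex_w, "<a href=''>#w#</a>", "all")>
                      ```

                      Note that the replacement string of REReplace has its own special sequences (backreferences such as \1), so a w that itself contains backslashes would need separate care on the replacement side.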

                      • 8. Re: Any regExp expert here?
                        BKBK Adobe Community Professional & MVP

                        There has been a misunderstanding. In your original question you wished to replace the whole word T-cell.

                         

                        Even considering the case where there is an arbitrary number of -, I see a problem. What you mentioned above as part of your code, REreplace("#varName4longtext.", "#w#", "<a href=''>#w#</a>"), is likely to be where things went wrong.

                         

                        You might have been aiming for REreplace(varName4longtext, "#w#", "<a href=''>#w#</a>", "all").