9 Replies Latest reply on May 20, 2009 12:58 PM by ilssac

    RegEx Help

    Balance

      I have a string that looks like this:

       

      <DIV id="<:HTMLContent1:>">
      <P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/e.cfm?m=<:cid:>.6.0.45" target=_blank>Free tutorials</A> <BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">This month's free online tutorials cover new techniques</SPAN>.</P></DIV>

       

      which I need to parse and make look like this:

       

      <:HTMLContent1:>

       

      The thing is, it could be from HTMLContent1 through HTMLContent8, so the RegEx needs to account for that.

      Basically, I need to just pull the ID value of the div tag.

       

      TIA!

        • 1. Re: RegEx Help
          Balance Level 1

          anyone?

          • 2. Re: RegEx Help
            ilssac Level 5
            HTMLContent.

             

            The dot is the single-character wildcard in regular expression syntax (a question mark instead makes the preceding character optional).  This will match the string "HTMLContent" followed by any one character.  To restrict the final character to the digits 1 through 8, use a character class:

            HTMLContent[1-8]
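
            A minimal sketch of the difference between the dot and the [1-8] character class. The sample values are hypothetical and chosen only to show which inputs each pattern accepts.

            ```cfml
            <cfscript>
            // Hypothetical sample values, not taken from the thread.
            samples = ["HTMLContent1", "HTMLContent8", "HTMLContent9", "HTMLContentX"];

            for (i = 1; i <= ArrayLen(samples); i++) {
                // REFind returns the 1-based position of the first match, or 0 if there is none.
                anyChar = REFind("HTMLContent.", samples[i]);
                oneTo8  = REFind("HTMLContent[1-8]", samples[i]);
                WriteOutput(samples[i] & ": dot = " & anyChar & ", [1-8] = " & oneTo8 & "<br>");
            }
            </cfscript>
            ```

            The dot pattern should match all four values, while the character class should report 0 for the last two.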
            
            • 3. Re: RegEx Help
              Balance Level 1

              Ian,

               

              Thanks, but do you have any thoughts on how I can parse out only the <:HTMLContent[1-8]:> portion from that string?

              • 4. Re: RegEx Help
                ilssac Level 5

                Umm, just put the regex into a refind() function!

                refind("HTMLContent[1-8]",contentStringToSearch...)
                

                You will probably want to read the documentation on the refind() function to decide how to set the returnsubexpressions parameter: true returns a structure holding arrays of the pos and len of the matches, while false returns just the start position.

                 

                It looks like you also want the angle brackets and colons in your return, so that would be:

                refind("<:HTMLContent[1-8]:>"....)
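
                A rough sketch of turning the pos and len values into the matched text. The input string is a shortened, hypothetical stand-in for the real markup, and the variable names match and placeholder are made up for illustration.

                ```cfml
                <cfscript>
                // Hypothetical, shortened input; the real string carries more markup.
                contentStringToSearch = '<DIV id="<:HTMLContent1:>"><P>Free tutorials</P></DIV>';

                // Arguments: regex, string to search, start position, returnsubexpressions.
                match = REFind("<:HTMLContent[1-8]:>", contentStringToSearch, 1, true);

                // pos[1] and len[1] describe the whole match; pos[1] is 0 when nothing matched.
                if (match.pos[1] GT 0) {
                    placeholder = Mid(contentStringToSearch, match.pos[1], match.len[1]);
                    // Escape the angle brackets so the value is visible in a browser.
                    WriteOutput(HTMLEditFormat(placeholder));
                }
                </cfscript>
                ```

                This only finds the first occurrence; to walk through several, the start position would have to be advanced past each match.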
                
                • 5. Re: RegEx Help
                  Balance Level 1

                  I was hoping someone could help me with one REReplaceNoCase statement to replace all instances in one shot.

                  RegEx is like chinese to me...ggrrrrrr

                   

                  thnx

                  • 6. Re: RegEx Help
                    craigkaminsky Level 3

                    Balance,

                    Ian actually had the base of it all there for you!

                     

                    ReReplaceNoCase(contentStringToSearch, "HTMLContent[1-8]", "string to replace", "ALL")

                     

                    This says, find the string HTMLContent followed by a digit 1-8 in the variable contentStringToSearch. Replace all of the instances found with the "string to replace". Do you need more than that?
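
                    For reference, REReplaceNoCase takes the string to search first, then the pattern, then the replacement, then the scope. A small sketch with a hypothetical input string and a placeholder replacement value:

                    ```cfml
                    <cfscript>
                    // Hypothetical input: two values in range (one lower-case) and one out of range.
                    contentStringToSearch = "htmlcontent1 and HTMLContent2 and HTMLContent9";

                    // Argument order: string, regex, replacement, scope.
                    result = REReplaceNoCase(contentStringToSearch, "HTMLContent[1-8]", "X", "ALL");

                    // The case-insensitive match replaces the first two; HTMLContent9 is left alone.
                    WriteOutput(result);
                    </cfscript>
                    ```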

                     

                    Cheers,

                    Craig

                    • 7. Re: RegEx Help
                      ilssac Level 5

                      I think the OP wants to reduce the entire div tag down to the HTMLContent string, which would look something like this:

                      ReReplaceNoCase( '<div[^>]*(<:HTMLContent[1-8]:>)">*?</div>', contentStringToSearch, "\1", "ALL")
                      
                      • 8. Re: RegEx Help
                        Balance Level 1

                        Ian,

                         

                        Thanks, but that RegEx didn't change anything (even after I switched the parameters, since you had them backwards).

                        Here's what the entire string looks like:

                        <DIV id="<:HTMLContent1:>">
                        <P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/d.cfm?m=19866" target=_blank>Free tutorials</A> <BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">This month's free on line tutorials</SPAN>.</P></DIV>
                        <DIV id="<:HTMLContent2:>">
                        <P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/d.cfm?m=19867" target=_blank>3G Evolution</A><BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">A comprehensive understanding of LTE design</SPAN></P></DIV>
                        <DIV id="<:HTMLContent3:>">
                        <P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/d.cfm?m=19870" target=_blank>New Title!</A> <BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">Broadcasting Standards covers basic principles</SPAN></P></DIV>
                        
                        

                         

                        Here's what the RegEx looks like:

                        <cfscript>
                        variables.content=ReReplaceNoCase(variables.content,'<div[^>]*(<:HTMLContent[1-8]:>)">*?</div>',"\1", "ALL");
                        </cfscript>
                        

                         

                        When I output variables.content after the RegEx it looks identical to the original text.

                         

                        Thanks again.

                        • 9. Re: RegEx Help
                          ilssac Level 5

                          A check of a Regex syntax guide would probably have pointed out that I forgot one little dot character.

                           

                          <cfscript>
                          variables.content = ReReplaceNoCase(variables.content,'<div[^>]*(<:HTMLContent[1-8]:>).*?</div>',"\1", "ALL");
                          </cfscript>
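
                          A sketch of the corrected pattern run against a shortened, hypothetical stand-in for variables.content, with REMatchNoCase shown as an alternative when only the placeholders are needed. It assumes the dot matches line breaks, as the ColdFusion regex documentation describes; that is worth confirming on the engine in use. The names nl, sample, collapsed and placeholders are illustrative only.

                          ```cfml
                          <cfscript>
                          // Hypothetical, shortened stand-in for variables.content.
                          nl = Chr(13) & Chr(10);
                          sample = '<DIV id="<:HTMLContent1:>">' & nl & '<P>Free tutorials</P></DIV>' & nl
                                 & '<DIV id="<:HTMLContent2:>">' & nl & '<P>3G Evolution</P></DIV>';

                          // Collapse each div to the placeholder captured by the parentheses (\1).
                          collapsed = REReplaceNoCase(sample, '<div[^>]*(<:HTMLContent[1-8]:>).*?</div>', "\1", "ALL");

                          // Alternative: collect the placeholders into an array without rewriting the string.
                          placeholders = REMatchNoCase("<:HTMLContent[1-8]:>", sample);

                          WriteOutput(HTMLEditFormat(collapsed) & "<br>");
                          WriteOutput(HTMLEditFormat(ArrayToList(placeholders)));
                          </cfscript>
                          ```

                          Because .*? stops at the first closing div tag, this approach assumes the content divs are not nested inside one another.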