RegEx Help

May 18, 2009

I have a string that looks like this:

<DIV id="<:HTMLContent1:>">
<P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/e.cfm?m=<:cid:>.6.0.45" target=_blank>Free tutorials</A> <BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">This month's free online tutorials cover new techniques</SPAN>.</P></DIV>

which I need to parse and make look like this:

<:HTMLContent1:>

The thing is, it could be from HTMLContent1 through HTMLContent8, so the RegEx needs to account for that.

Basically, I need to just pull the ID value of the div tag.

TIA!

May 19, 2009

anyone?

May 19, 2009

HTMLContent.

The dot is the single-character wildcard in regular expression syntax, so this matches the string "HTMLContent" followed by any one character. (A question mark would instead make the preceding "t" optional.) If you want to restrict the final character to the digits 1 through 8, use a character class:

HTMLContent[1-8]
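
As a rough sketch of what the character class does (the loop and the variable names are made up for illustration, and the ^ and $ anchors are added only so each test string has to match in full):

```cfm
<cfscript>
// Hypothetical test values: HTMLContent1 through HTMLContent9.
for (n = 1; n LTE 9; n = n + 1) {
    candidate = "HTMLContent" & n;
    // REFind returns the 1-based position of the first match, or 0 when there is none.
    position = REFind("^HTMLContent[1-8]$", candidate);
    writeOutput(candidate & " -> " & position & "<br>");
}
</cfscript>
```

Only the candidates ending in 1 through 8 should report a non-zero position; HTMLContent9 falls outside the class.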

May 19, 2009

Ian,

Thanks, but do you have any thoughts on how I can parse out only the <:HTMLContent[1-8]:> portion from that string?

May 19, 2009

Umm, just put the regex into a refind() function!

refind("HTMLContent[1-8]",contentStringToSearch...)

You will probably want to read the documentation on the refind() function to decide whether to set the returnSubExpressions argument to true, which returns a structure containing pos and len arrays for the matches, or false, which just returns the start position.

It looks like you also want the angle brackets and colons in your return, so that would be:

refind("<:HTMLContent[1-8]:>"....)
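
For example, a minimal sketch of pulling the matched text out with the pos and len arrays. The sample string is shortened and the variable names are hypothetical, not taken from the original code:

```cfm
<cfscript>
// Shortened, hypothetical sample input.
content = '<DIV id="<:HTMLContent1:>"><P>Free tutorials</P></DIV>';

// Passing true as the fourth argument makes REFind return a structure
// with two arrays, pos and len, instead of a single position.
match = REFind("<:HTMLContent[1-8]:>", content, 1, true);

// pos[1] and len[1] describe the whole match; pos[1] is 0 when nothing matched.
if (match.pos[1] GT 0) {
    placeholder = Mid(content, match.pos[1], match.len[1]);
    // HTMLEditFormat escapes the angle brackets so the browser shows them.
    writeOutput(HTMLEditFormat(placeholder));
}
</cfscript>
```

This only finds the first occurrence; to walk through several, the start argument would have to be advanced past each match.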

May 19, 2009

I was hoping someone could help me with one REReplaceNoCase statement to replace all instances in one shot.

RegEx is like chinese to me...ggrrrrrr

thnx

May 20, 2009

Balance,

Ian actually had the base of it all there for you!

ReReplaceNoCase(contentStringToSearch, "HTMLContent[1-8]", "string to replace", "ALL")

This says, find the string HTMLContent followed by a digit 1-8 in the variable contentStringToSearch. Replace all of the instances found with the "string to replace". Do you need more than that?

Cheers,

Craig

May 20, 2009

I think the OP wants to reduce the entire div tag down to the HTMLContent string.... Which would look close to something like this.

ReReplaceNoCase( '<div[^>]*(<:HTMLContent[1-8]:>)">*?</div>', contentStringToSearch, "\1", "ALL")

May 20, 2009

Ian,

Thanks, but that RegEx didn't change anything (even after I switched the parameters, since you had them backwards).

Here's what the entire string looks like:

<DIV id="<:HTMLContent1:>">
<P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/d.cfm?m=19866" target=_blank>Free tutorials</A> <BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">This month's free on line tutorials</SPAN>.</P></DIV>
<DIV id="<:HTMLContent2:>">
<P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/d.cfm?m=19867" target=_blank>3G Evolution</A><BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">A comprehensive understanding of LTE design</SPAN></P></DIV>
<DIV id="<:HTMLContent3:>">
<P><A style="FONT: bold 13px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none" href="http://192.168.1.1/d.cfm?m=19870" target=_blank>New Title!</A> <BR><SPAN style="FONT: 11px Arial, Helvetica, sans-serif; COLOR: #000000; TEXT-DECORATION: none">Broadcasting Standards covers basic principles</SPAN></P></DIV>

Here's what the RegEx looks like:

<cfscript>
variables.content=ReReplaceNoCase(variables.content,'<div[^>]*(<:HTMLContent[1-8]:>)">*?</div>',"\1", "ALL");
</cfscript>

When I output variables.content after the RegEx it looks identical to the original text.

Thanks again.

May 20, 2009

A check of a Regex syntax guide would probably have pointed out that I forgot one little dot character.

<cfscript>
variables.content = ReReplaceNoCase(variables.content,'<div[^>]*(<:HTMLContent[1-8]:>).*?</div>',"\1", "ALL");
</cfscript>
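
To see why this version works: [^>]* backs off until the parenthesised placeholder can match, and .*? then lazily consumes everything up to the nearest closing div tag, so each block collapses to its captured \1. Here is a small sketch on a shortened, made-up two-block sample (the content variable and the sample markup are illustrative, not the real data):

```cfm
<cfscript>
// Shortened, hypothetical sample; the real markup has much longer bodies.
content = '<DIV id="<:HTMLContent1:>"><P>Free tutorials</P></DIV>'
        & Chr(10)
        & '<DIV id="<:HTMLContent2:>"><P>3G Evolution</P></DIV>';

// \1 is the captured placeholder; "ALL" replaces every div block, not just the first.
result = REReplaceNoCase(content, '<div[^>]*(<:HTMLContent[1-8]:>).*?</div>', "\1", "ALL");

// Escape the angle brackets so the placeholders are visible in the browser.
writeOutput(HTMLEditFormat(result));
</cfscript>
```

On this sample, result should be reduced to the two placeholders, <:HTMLContent1:> and <:HTMLContent2:>, separated by the original line break.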
