Just do it in two passes.
See the attached.
Your script gave the exact same result; however, I think I realize what's going on now. The server I'm using is CF5. When I run my script (yours too) on MX7 it works.
BTW, I don't think there is any need to run two passes to strip the HTML in an attempt to find a match. If the regex doesn't find a match, it wouldn't do any good to strip the HTML off the first result (i.e. sLInkEssentials -> sLinksNoMarkup).
Are there known issues and workarounds with regexes on CF5?
The construct (.*?) looks very strange for a regular expression.
. - match any character
* - match 0 or more times
? - match 0 or 1 times
Remember that the ? character has special meaning in regular expressions. I think in your case you can leave it out altogether unless you are trying to match an actual question mark, in which case you would escape it as "\?".
If you carefully run the code I supplied you'll see that it does not give the same result as your regex.
Yours left the title HTML untouched and also returned extra garbage like target="_blank".
Anyway, I no longer support CF5 unless someone pays me. You might start a new thread making it clear that you are using CF5.
I strip the HTML before I dump it in the database; however, that is cosmetic only and has zero effect on the actual matching of either regular expression, which is what the problem is.
Your regex works very nicely, and I'll be implementing a modified version to do the trick (I can't assume the attribute values are always enclosed in quotes).
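For anyone following along, here is a rough sketch of one way to handle attribute values that may or may not be quoted. It is written in Python rather than ColdFusion, purely as an illustration. The pattern, the hrefs helper, and the sample markup are all made up for this example and are not the regex from the attachment.

```python
import re

# Hypothetical pattern (not the one from the thread): capture an href value
# that is double-quoted, single-quoted, or not quoted at all.
HREF = re.compile(
    r'''<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))''',
    re.IGNORECASE,
)

def hrefs(html):
    # Only one of the three capture groups participates in each match;
    # the other two are None, so pick the one that is set.
    return [next(g for g in m.groups() if g is not None)
            for m in HREF.finditer(html)]

# Made-up sample input covering all three quoting styles.
sample = ('<a href="a.html">A</a> '
          '<a target=_blank href=b.html>B</a> '
          "<A HREF='c.html'>C</A>")
print(hrefs(sample))
```

The unquoted branch simply reads up to the next whitespace or closing angle bracket, which seems to be the usual convention for bare attribute values.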
*no longer support CF5 unless someone pays me...* Must be nice :)
Healey_Mark, thanks for the reply.
The .*? is a "lazy star": it matches as few characters as possible, stopping as soon as the next part of the expression can match. For example, <a[^>]*>(.*?)</a> would return everything between each pair of A tags (<a></a>).
If I used .* instead, it would make a greedy match, running from that point to the last </a> in the input rather than the nearest one.
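To make the difference concrete, here is a small sketch in Python (not ColdFusion; the sample string is invented for illustration). It also shows a negated character class as a possible alternative on engines that lack lazy quantifiers. That version only works when the link text contains no nested tags.

```python
import re

# Made-up sample input, not taken from the thread.
html = '<a href="one.html">First</a> and <a href="two.html">Second</a>'

# Lazy: (.*?) stops at the nearest </a>, so each link is captured on its own.
lazy = re.findall(r'<a[^>]*>(.*?)</a>', html)

# Greedy: (.*) runs on to the last </a>, merging both links into one match.
greedy = re.findall(r'<a[^>]*>(.*)</a>', html)

# Negated character class: no lazy quantifier required, but the link text
# must not contain any "<" (i.e. no nested markup).
no_lazy = re.findall(r'<a[^>]*>([^<]*)</a>', html)

print(lazy)     # ['First', 'Second']
print(greedy)   # ['First</a> and <a href="two.html">Second']
print(no_lazy)  # ['First', 'Second']
```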