Hello,
I have the following function:
<cffunction name="stripHREFs" access="public" returntype="array" output="no" hint="Separate links from the given HTML string and return them as an array">
    <cfargument name="html" required="yes">
    <cfset local.startpos = 1>
    <cfset local.list = ArrayNew(1)>
    <cfloop condition="local.startpos GREATER THAN 0">
        <cfset local.linkpos = reFindNoCase('<a\b[^>]*>(.*?)</a>',arguments.html,local.startpos,'true')>
        <cfif val(local.linkpos.len[1])>
            <cfset local.startpos = local.linkpos.len[1]+local.linkpos.pos[1]>
            <cfset local.string = mid(arguments.html,local.linkpos.pos[1],local.linkpos.len[1])>
            <cfset local.hrefpos = reFindNoCase('(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+##]*[\w\-\@?^=%&/~\+##])?',local.string,1,'true')>
            <cfif val(local.hrefpos.pos[1])>
                <cfset local.this.a = mid(local.string,local.hrefpos.pos[1],local.hrefpos.len[1])>
                <cfset local.this.title = reReplacenocase(local.string,'<a\b[^>]*.>',"")>
                <cfset local.this.title = reReplacenocase(local.this.title,'</a*>',"")>
                <cfset ArrayAppend(local.list,local.this)>
                <cfset StructDelete(local,'this')>
            </cfif>
        <cfelse>
            <cfbreak>
        </cfif>
    </cfloop>
    <cfreturn local.list>
</cffunction>
It works great, except now my client has decided to include links with an additional attribute called "alias".
The code looks like this: <a href="http://www.acme.com" alias="foo">click me</a>
How can I pull out the "alias" attribute?
TIA
You can include the alias attribute in much the same way you are including the URL and the link title.
For example, to find out if an alias attribute exists, you could add this inside the loop:
<cfset local.alias = REFind('alias="([^"]+)"',local.string,1,true)>
Then, if it exists, add it to the struct before you append it to the array:
<cfif val(local.alias.pos[1])>
    <cfset local.this.alias = REReplace(mid(local.string,local.alias.pos[1],local.alias.len[1]),'(alias=)?"','','ALL')>
</cfif>
The example above makes certain assumptions, e.g. that the alias attribute name is in lowercase and that the attribute value is enclosed in double quotes. You may need to adjust the pattern if your client's input does not fit that format.
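As a possible shortcut, here is a minimal sketch that reads the attribute value straight from the first subexpression instead of stripping the alias= prefix and the quotes afterwards. It assumes the same local.string and local.this variables as the function in the question; local.aliaspos is only an illustrative name, and the pattern still assumes a double-quoted, non-empty value.

```cfml
<!--- With returnsubexpressions = true, pos[1]/len[1] describe the whole match
      and pos[2]/len[2] describe the first parenthesised group (the value). --->
<cfset local.aliaspos = reFindNoCase('\salias\s*=\s*"([^"]+)"', local.string, 1, true)>

<!--- pos[1] is 0 when nothing matched; in that case pos[2] does not exist,
      so check pos[1] before touching the second element. --->
<cfif local.aliaspos.pos[1] GT 0>
    <cfset local.this.alias = mid(local.string, local.aliaspos.pos[2], local.aliaspos.len[2])>
</cfif>
```

Because the value comes from the captured group, there is no second REReplace() pattern to keep in sync with the search pattern.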
It almost works but it's outputting this:
local.string = <a href="http://www.google.com" alias="my link alias">learn more</a>
local.aliaspos.pos[1] = 33
local.aliaspos.len[1] = 21
local.this.alias = alias=my link alias
Should be:
local.this.alias = my link alias
Here's the updated code:
<cfloop condition="local.startpos GREATER THAN 0">
    <cfset local.linkpos = reFindNoCase('<a\b[^>]*>(.*?)</a>',variables.html,local.startpos,'true')>
    <cfif val(local.linkpos.len[1])>
        <cfset local.startpos = local.linkpos.len[1]+local.linkpos.pos[1]>
        <cfset local.string = mid(variables.html,local.linkpos.pos[1],local.linkpos.len[1])>
        <cfoutput>local.string = <xmp>#local.string#</xmp><br></cfoutput>
        <cfset local.hrefpos = reFindNoCase('(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+##]*[\w\-\@?^=%&/~\+##])?',local.string,1,'true')>
        <cfset local.aliaspos = reFind('alias="([^"]+)"',local.string,1,'true')>
        <cfoutput>
            local.aliaspos.pos[1] = #local.aliaspos.pos[1]#<br>
            local.aliaspos.len[1] = #local.aliaspos.len[1]#<br>
        </cfoutput><br>
        <cfif val(local.hrefpos.pos[1])>
            <cfset local.this.href = mid(local.string,local.hrefpos.pos[1],local.hrefpos.len[1])>
            <cfif val(local.aliaspos.pos[1])>
                <cfset local.this.alias = reReplaceNoCase(mid(local.string,local.aliaspos.pos[1],local.aliaspos.len[1]),'(a lias=)?"','','all')>
                <cfoutput>
                    <p>
                    local.this.alias = #local.this.alias#
                    </p>
                </cfoutput>
            </cfif>
            <cfset local.this.title = reReplaceNoCase(local.string,'<a\b[^>]*.>',"")>
            <cfset local.this.title = reReplaceNoCase(local.this.title,'</a*>',"")>
            <cfset ArrayAppend(local.list,local.this)>
            <cfset StructDelete(local,'this')>
        </cfif>
    <cfelse>
        <cfbreak>
    </cfif>
</cfloop>
I think the REFind() needs a little tweaking so local.aliaspos.pos[1] is 31 and not 33.
Thanks
I ended up replacing all this RegEx nonsense with the awesome http://jsoup.org/ library and everything works like a charm! Thanks.
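For anyone landing here later, a rough sketch of what a jsoup-based version of the function might look like. This is not the code actually used above. It assumes the jsoup jar is available on the ColdFusion classpath, and the local.doc, local.links, local.it, local.link and local.item names are illustrative.

```cfml
<cffunction name="stripHREFs" access="public" returntype="array" output="no">
    <cfargument name="html" required="yes">
    <cfset local.list = ArrayNew(1)>

    <!--- Assumption: org.jsoup.Jsoup can be loaded by createObject() --->
    <cfset local.doc = createObject("java", "org.jsoup.Jsoup").parse(javaCast("string", arguments.html))>

    <!--- CSS selector: every <a> element that carries an href attribute --->
    <cfset local.links = local.doc.select("a[href]")>
    <cfset local.it = local.links.iterator()>

    <cfloop condition="local.it.hasNext()">
        <cfset local.link = local.it.next()>
        <cfset local.item = StructNew()>
        <cfset local.item.href = local.link.attr("href")>
        <cfset local.item.title = local.link.text()>
        <!--- Only add the alias key when the attribute is present --->
        <cfif local.link.hasAttr("alias")>
            <cfset local.item.alias = local.link.attr("alias")>
        </cfif>
        <cfset ArrayAppend(local.list, local.item)>
    </cfloop>

    <cfreturn local.list>
</cffunction>
```

Unlike the regex version, this reads href as written in the markup rather than matching only http, https and ftp URLs, so relative links would be returned too.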