RegEx Help

Explorer, Jun 14, 2013

Hello,

I have the following function:

<cffunction name="stripHREFs" access="public" returntype="array" output="no" hint="Separate links from a given HTML string; output as an array">
    <cfargument name="html" required="yes">
    <cfset local.startpos = 1>
    <cfset local.list = ArrayNew(1)>

    <cfloop condition="local.startpos GREATER THAN 0">
        <!--- Find the next <a>...</a> element, starting at local.startpos --->
        <cfset local.linkpos = reFindNoCase('<a\b[^>]*>(.*?)</a>',arguments.html,local.startpos,'true')>
        <cfif val(local.linkpos.len[1])>
            <!--- Continue the next search just past this match --->
            <cfset local.startpos = local.linkpos.len[1]+local.linkpos.pos[1]>
            <cfset local.string = mid(arguments.html,local.linkpos.pos[1],local.linkpos.len[1])>
            <!--- Look for a URL inside the matched element --->
            <cfset local.hrefpos = reFindNoCase('(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+##]*[\w\-\@?^=%&/~\+##])?',local.string,1,'true')>
            <cfif val(local.hrefpos.pos[1])>
                <cfset local.this.a = mid(local.string,local.hrefpos.pos[1],local.hrefpos.len[1])>
                <!--- Strip the opening and closing tags to leave the link text --->
                <cfset local.this.title = reReplacenocase(local.string,'<a\b[^>]*.>',"")>
                <cfset local.this.title = reReplacenocase(local.this.title,'</a*>',"")>
                <cfset ArrayAppend(local.list,local.this)>
                <cfset StructDelete(local,'this')>
            </cfif>
        <cfelse>
            <cfbreak>
        </cfif>
    </cfloop>

    <cfreturn local.list>
</cffunction>

It works great, except now my client has decided to include links with an additional attribute called "alias".

The code looks like this: <a href="http://www.acme.com" alias="foo">click me</a>

How can I pull out the "alias" attribute?

TIA

Participant, Jun 14, 2013

You can include the alias attribute in much the same way you are including the URL and the link title.

For example, to find out if an alias attribute exists, you could add this inside the loop:

<cfset local.alias = REFind('alias="([^"]+)"',local.string,1,true)>

Then, if it exists, add it to the struct before you append it to the array:

<cfif val(local.alias.pos[1])>
    <cfset local.this.alias = REReplace(mid(local.string,local.alias.pos[1],local.alias.len[1]),'(alias=)?"','','ALL')>
</cfif>

The example above makes certain assumptions, e.g. that the alias attribute is in lowercase and that the attribute value is enclosed in double quotes. You may need to adjust it if your client's input does not fit that format.
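If the input may use uppercase attribute names, single quotes, or spaces around the equals sign, a looser pattern is one option. The following is only a sketch: the function name getAliasLoose is hypothetical, and it still assumes the value is quoted and not empty.

```cfml
<cffunction name="getAliasLoose" access="public" returntype="string" output="no">
    <cfargument name="tag" type="string" required="yes">
    <!--- The parenthesised group captures the value together with its quotes,
          either double or single (single quotes are doubled to escape them in CFML) --->
    <cfset local.m = reFindNoCase('\balias\s*=\s*("[^"]+"|''[^'']+'')', arguments.tag, 1, true)>
    <cfif local.m.pos[1] GT 0>
        <!--- pos[2]/len[2] describe the group; skip the opening quote, drop the closing one --->
        <cfreturn mid(arguments.tag, local.m.pos[2] + 1, local.m.len[2] - 2)>
    </cfif>
    <!--- No alias attribute found --->
    <cfreturn "">
</cffunction>
```

Inside the loop this could be called as getAliasLoose(local.string), adding the result to the struct only when it is not an empty string.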

Explorer, Jun 14, 2013

It almost works but it's outputting this:

local.string = <a href="http://www.google.com" alias="my link alias">learn more</a>

local.aliaspos.pos[1] = 33

local.aliaspos.len[1] = 21

local.this.alias = alias=my link alias

Should be:

local.this.alias = my link alias

Here's the updated code:

<cfloop condition="local.startpos GREATER THAN 0">
    <cfset local.linkpos = reFindNoCase('<a\b[^>]*>(.*?)</a>',variables.html,local.startpos,'true')>
    <cfif val(local.linkpos.len[1])>
        <cfset local.startpos = local.linkpos.len[1]+local.linkpos.pos[1]>
        <cfset local.string = mid(variables.html,local.linkpos.pos[1],local.linkpos.len[1])>
        <cfoutput>local.string = <xmp>#local.string#</xmp><br></cfoutput>
        <cfset local.hrefpos = reFindNoCase('(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+##]*[\w\-\@?^=%&/~\+##])?',local.string,1,'true')>
        <cfset local.aliaspos = reFind('alias="([^"]+)"',local.string,1,'true')>
        <cfoutput>
            local.aliaspos.pos[1] = #local.aliaspos.pos[1]#<br>
            local.aliaspos.len[1] = #local.aliaspos.len[1]#<br>
        </cfoutput><br>
        <cfif val(local.hrefpos.pos[1])>
            <cfset local.this.href = mid(local.string,local.hrefpos.pos[1],local.hrefpos.len[1])>
            <cfif val(local.aliaspos.pos[1])>
                <cfset local.this.alias = reReplaceNoCase(mid(local.string,local.aliaspos.pos[1],local.aliaspos.len[1]),'(a lias=)?"','','all')>
                <cfoutput>
                <p>
                local.this.alias = #local.this.alias#
                </p>
                </cfoutput>
            </cfif>
            <cfset local.this.title = reReplaceNoCase(local.string,'<a\b[^>]*.>',"")>
            <cfset local.this.title = reReplaceNoCase(local.this.title,'</a*>',"")>
            <cfset ArrayAppend(local.list,local.this)>
            <cfset StructDelete(local,'this')>
        </cfif>
    <cfelse>
        <cfbreak>
    </cfif>
</cfloop>

I think the REFind() needs a little tweaking so local.aliaspos.pos[1] is 31 and not 33.
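For reference, the reported numbers are consistent with the code as posted. In the sample string, alias="my link alias" begins at character 33 and spans 21 characters, so pos[1] = 33 is the start of the whole match and REFind() is behaving as expected. The leftover "alias=" appears to come from the cleanup pattern '(a lias=)?"', which contains a space and so only ever matches the quote characters. One way to avoid the cleanup step altogether is to read the parenthesised subexpression through pos[2] and len[2]. A minimal sketch, where the function name getAlias is hypothetical:

```cfml
<cffunction name="getAlias" access="public" returntype="string" output="no">
    <cfargument name="tag" type="string" required="yes">
    <!--- With the fourth argument set to true, pos[1]/len[1] describe the whole
          match and pos[2]/len[2] describe the first parenthesised group --->
    <cfset local.aliaspos = reFindNoCase('alias="([^"]+)"', arguments.tag, 1, true)>
    <cfif local.aliaspos.pos[1] GT 0>
        <!--- Return only the captured attribute value, without alias= or quotes --->
        <cfreturn mid(arguments.tag, local.aliaspos.pos[2], local.aliaspos.len[2])>
    </cfif>
    <!--- No alias attribute found --->
    <cfreturn "">
</cffunction>
```

For the sample string above, the group starts at character 40 and is 13 characters long, so getAlias(local.string) should return my link alias.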

Thanks

Explorer, Jun 19, 2013

I ended up replacing all this RegEx nonsense with the awesome http://jsoup.org/ library, and everything works like a charm! Thanks.
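The jsoup code itself was not posted, so the following is only a sketch of how such a replacement might look. It assumes the jsoup jar is on the ColdFusion classpath; the function name stripHREFsJsoup is hypothetical. The struct keys href, title and alias mirror the regex version above.

```cfml
<cffunction name="stripHREFsJsoup" access="public" returntype="array" output="no">
    <cfargument name="html" type="string" required="yes">
    <cfset local.list = ArrayNew(1)>
    <!--- Assumption: org.jsoup.Jsoup can be loaded from the classpath --->
    <cfset local.doc = createObject("java", "org.jsoup.Jsoup").parse(arguments.html)>
    <!--- Select every <a> element that has an href attribute --->
    <cfset local.it = local.doc.select("a[href]").iterator()>

    <cfloop condition="local.it.hasNext()">
        <cfset local.el = local.it.next()>
        <!--- Build a fresh struct for each link --->
        <cfset local.link = StructNew()>
        <cfset local.link.href = local.el.attr("href")>
        <cfset local.link.title = local.el.text()>
        <!--- Only add the alias key when the attribute is present --->
        <cfif local.el.hasAttr("alias")>
            <cfset local.link.alias = local.el.attr("alias")>
        </cfif>
        <cfset ArrayAppend(local.list, local.link)>
    </cfloop>

    <cfreturn local.list>
</cffunction>
```

Note that text() returns the link's text with any nested markup removed, whereas the regex version kept the inner HTML; html() could be used instead if the markup matters.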
