Mike Given

Parsing regex from an ".ini" file

Apr 19, 2010 10:56 AM

Hey folks,

My first post to CF fora here - not sure if this is a) the appropriate spot or even b) an appropriate question.


I have a routine that works, but it's awfully slow considering that it needs to be called at the beginning of every session - I'm parsing through a fairly large .ini file for browser detection from application.cfc - and I would appreciate any insights if anyone sees anything I'm just hosing up badly.  Obviously it may not be answerable due to idiosyncrasies of the browscap.ini file (which are tedious to go through).

 

<!--- 
  **********************************************************************
  Fetch user agent information
  **********************************************************************
  Important: This routine depends largely on keeping the browser
  capability file up-to-date.  Updates are currently available from:
  http://browsers.garykeith.com/stream.asp?BrowsCapINI
  ---------------------------------------------------------------------- --->
<cffunction name="getBrowserInfo">
  <cfscript>
    // Set location of browscap.ini ----------------------------
    browscap_ini = expandPath("./browscap.ini");
    // Read wildcard patterns from the INI file  ---------------
    browscap_list = getProfileSections(browscap_ini);
    // Seed some variables -------------------------------------
    browser_champion_pattern = "*";
    browser_champion_regex = "^.*$";
    default_id = "*";
  </cfscript>
  <cfloop list="#browscap_list[default_id]#" index="keyname">
    <cfscript>
      xvalue = getProfileString(browscap_ini, default_id, keyname);
      if (keyname neq "parent") { browscap['#keyname#'] = xvalue; }
    </cfscript>
  </cfloop>
  <cfscript>
    // Loop through the patterns to find the best match --------
    for (browscap.browser_name_pattern in browscap_list) {
      // Massage the wildcard into useable regex ---------------
      browscap.browser_name_regex = lCase(browscap.browser_name_pattern);
      browscap.browser_name_regex = replace(browscap.browser_name_regex, ".", "\.", "all");
      browscap.browser_name_regex = replace(browscap.browser_name_regex, "*", ".*", "all");
      browscap.browser_name_regex = replace(browscap.browser_name_regex, "?", ".", "all");
      browscap.browser_name_regex = replace(browscap.browser_name_regex, "(", "\(", "all");
      browscap.browser_name_regex = replace(browscap.browser_name_regex, ")", "\)", "all");
      browscap.browser_name_regex = replace(browscap.browser_name_regex, "[", "\[", "all");
      browscap.browser_name_regex = replace(browscap.browser_name_regex, "]", "\]", "all");
      if (right(browscap.browser_name_regex, 1) eq "*") {
        browscap.browser_name_regex = browscap.browser_name_regex & "$"; }
      browscap.browser_name_regex = "^" & browscap.browser_name_regex;
      // Test the resulting regex against the user agent -------
      if (isValid("regular_expression", lCase(CGI.HTTP_USER_AGENT), browscap.browser_name_regex)) {
        // User agent matches regex so we got a challenger -----
        if (len(browscap.browser_name_pattern) ge len(browser_champion_pattern)) {
          // If challenger is at least as long as champ then we got a new champ ----
          browser_champion_pattern = browscap.browser_name_pattern;
          browser_champion_regex = browscap.browser_name_regex; } } }
    // Set the winning regex patterns --------------------------
    browscap.browser_name_pattern = browser_champion_pattern;
    browscap.browser_name_regex = browser_champion_regex;
  </cfscript>
  <!--- Check for a living parent record ----------------------- --->
  <cfif (len(getProfileString(browscap_ini, browscap.browser_name_pattern, "parent")) gt 0)>
    <cfset parent_id = getProfileString(browscap_ini, browscap.browser_name_pattern, "parent")>
    <!--- Fetch the parental info ------------------------------ --->
    <cfloop list="#browscap_list[parent_id]#" index="keyname">
      <cfscript>
        xvalue = getProfileString(browscap_ini, parent_id, keyname);
        if (keyname neq "parent") { Session['agent_#keyname#'] = xvalue; }
      </cfscript>
    </cfloop>
  </cfif>
  <!--- Fetch the winning info --------------------------------- --->
  <cfloop list="#browscap_list[browser_champion_pattern]#" index="keyname">
    <cfscript>
      xvalue = getProfileString(browscap_ini, browser_champion_pattern, keyname);
      if (keyname neq "parent") { Session['agent_#keyname#'] = xvalue; }
    </cfscript>
  </cfloop>
</cffunction>
 
Replies
  • Apr 19, 2010 11:15 AM   in reply to Mike Given

    I'm afraid I must say that this code is too dense for me to easily grok.

     

    But the first thing I note is that you only showed the custom getBrowserInfo() function.

     

    How is this function getting called?

     

    I would want to make sure it is not getting called more often than necessary.

     

    Also, are you reading and parsing the file every time it is called?  If so, is that necessary?  Could it be called less often and stored in memory in some manner?  That would lessen the number of file I/O calls being used, which is usually one of the more costly actions.

     
  • Apr 19, 2010 12:38 PM   in reply to Mike Given

    Mike Given wrote:

     

    Yup, hafta parse on each call; the routine basically has a look at the CGI.HTTP_USER_AGENT variable and checks the browser's capabilities

     

    Yes, but does the INI file need to be loaded and parsed for the regex every time the CGI.HTTP_USER_AGENT is checked?  Personally, I would probably look at a caching idea where the INI file is loaded and parsed into its set of regex tests once, and that set is stored in memory.  Then the CGI.HTTP_USER_AGENT could be checked against the data in memory, instead of getting it from the file system.
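
    A minimal sketch of that caching idea, assuming the code lives in Application.cfc. The names loadBrowscapCache, matchUserAgent and application.browscapRegex are made up for illustration and do not appear in the original post; the wildcard conversion and the isValid() test are copied from the routine above.

```cfml
<!--- Hypothetical helper: convert every browscap pattern to a regex once --->
<cffunction name="loadBrowscapCache" returntype="void" output="false">
  <cfscript>
    var sections = getProfileSections(expandPath("./browscap.ini"));
    var cache = structNew();
    var pattern = "";
    var regex = "";
    for (pattern in sections) {
      // Same wildcard-to-regex steps as the original routine
      regex = lCase(pattern);
      regex = replace(regex, ".", "\.", "all");
      regex = replace(regex, "*", ".*", "all");
      regex = replace(regex, "?", ".", "all");
      regex = replace(regex, "(", "\(", "all");
      regex = replace(regex, ")", "\)", "all");
      regex = replace(regex, "[", "\[", "all");
      regex = replace(regex, "]", "\]", "all");
      if (right(regex, 1) eq "*") { regex = regex & "$"; }
      cache[pattern] = "^" & regex;
    }
    // Publish the finished struct with a single assignment
    application.browscapRegex = cache;
  </cfscript>
</cffunction>

<!--- Runs once when the application starts, not once per session --->
<cffunction name="onApplicationStart" returntype="boolean" output="false">
  <cfset loadBrowscapCache()>
  <cfreturn true>
</cffunction>

<!--- Hypothetical helper: pick the longest matching pattern from memory only --->
<cffunction name="matchUserAgent" returntype="string" output="false">
  <cfargument name="userAgent" type="string" required="true">
  <cfscript>
    var ua = lCase(arguments.userAgent);
    var champion = "*";
    var pattern = "";
    for (pattern in application.browscapRegex) {
      if (isValid("regular_expression", ua, application.browscapRegex[pattern])
          and len(pattern) ge len(champion)) {
        champion = pattern;
      }
    }
  </cfscript>
  <cfreturn champion>
</cffunction>
```

    With something like this in place, getBrowserInfo() could call matchUserAgent(CGI.HTTP_USER_AGENT) and would then only need getProfileString() for the winning section and its parent, rather than touching the file for every pattern.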

     

    Secondly, do you need to check against every regex every time?  Is there some place where you can short-circuit the looping once you have determined what you want to determine from the checking?
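
    One way this could look, as a rough sketch: since the posted routine keeps the longest pattern that matches, sorting the pattern names longest-first (once, not per request) lets the loop stop at the first hit. The function names and the regexFor argument are hypothetical; regexFor is assumed to be a struct mapping each pattern to the regex produced by the same conversion the posted routine uses.

```cfml
<!--- Hypothetical helper: list the pattern names longest first (do this once) --->
<cffunction name="sortPatternsByLength" returntype="array" output="false">
  <cfargument name="sections" type="struct" required="true">
  <cfscript>
    var lengths = structNew();
    var pattern = "";
    for (pattern in arguments.sections) {
      lengths[pattern] = len(pattern);
    }
  </cfscript>
  <!--- structSort returns the key names ordered by their numeric values --->
  <cfreturn structSort(lengths, "numeric", "desc")>
</cffunction>

<!--- Hypothetical helper: stop at the first hit in a longest-first list --->
<cffunction name="firstMatch" returntype="string" output="false">
  <cfargument name="userAgent" type="string" required="true">
  <cfargument name="orderedPatterns" type="array" required="true">
  <cfargument name="regexFor" type="struct" required="true">
  <cfscript>
    var ua = lCase(arguments.userAgent);
    var winner = "*";
    var i = 0;
    for (i = 1; i lte arrayLen(arguments.orderedPatterns); i = i + 1) {
      if (isValid("regular_expression", ua, arguments.regexFor[arguments.orderedPatterns[i]])) {
        // Longest patterns come first, so the first hit is the winner
        winner = arguments.orderedPatterns[i];
        break;
      }
    }
  </cfscript>
  <cfreturn winner>
</cffunction>
```

    Note that ties between equally long patterns may resolve differently than in the posted loop, which keeps the last equally long match it happens to see.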

     
  • Apr 19, 2010 12:44 PM   in reply to ilssac

    Secondly.  The idea would be to see if you need to check against every regex every time?  Is there some place where you can short cut the looping once you have determined what you want to determine from the checking?

     

    Good point.

     
  • Jun 28, 2012 3:44 PM   in reply to Mike Given

    There is a lot you could do to speed this up - specifically, reading in the INI file once and parsing it into a structure. That would remove the per-call file I/O and parsing time.

     