2 Replies Latest reply on Jun 24, 2009 10:33 PM by Ratsnackbar

    Website Statistics in Coldfusion

    josheby
      In the past I have been using sites such as sitemeter.com and other free statistics sites to help me monitor site usage. I am now working on a way to do this entirely in ColdFusion, using some code that is processed at the end of each request. My reason for wanting to do this in ColdFusion is that I can then track the values of a couple of important variables, and also use the information to directly affect my site, for example by referencing the most popular content. I am looking for people's opinions and suggestions on this.

      First of all, using free solutions such as the one I mentioned earlier does not have any performance effect on my application, as it is not running on my server. Does anyone think that I should be concerned about causing performance issues by doing this on my server via ColdFusion? Or are there things I should try to avoid to reduce this issue? Most of my sites will not be getting the traffic levels that I think it would take for this to cause performance problems, but I would like to plan ahead for this just in case.

      Secondly, I am concerned about getting false results due to search engine bots. With the free sites this is not an issue, as most of them include the image (counter) via a JavaScript call. ColdFusion is going to see a bot's request as it would any other request. Does anyone know of a way I can distinguish a bot from a human in ColdFusion? I am expecting this to be the hardest part.

      Last of all, does anyone have any good resources on this matter that they would be willing to share? I have found a couple via Google searches, but I seem to be having trouble finding valuable information.

      Thanks for everyone's help. It will be greatly appreciated!
        • 1. Re: Website Statistics in Coldfusion
          Grizzly9279 Level 1
          It sounds like you'd be much better off if you invested in a real web analytics service.

          See Google Analytics:
          http://www.google.com/analytics/

          Or perhaps Omniture:
          http://www.omniture.com/

          I know some (if not all) of these services allow you to track custom/dynamic variables (related to conversion tracking, etc.).

          You could do it yourself in ColdFusion, but it sounds like you already have a good sense of the caveats (performance, false positives with bots, etc.). Odds are you're going to end up spending a ton of time trying to "roll your own" on this, and as we all know, time is money. If it were me, I'd save myself the time and frustration and put my trust in a well-established analytics provider, even if it cost me a few bucks.

          My two cents...
          • 2. Re: Website Statistics in Coldfusion
            Ratsnackbar Level 2

            I would not suggest having ColdFusion track site statistics within its own code.  There could be performance issues with this.  What I would suggest is that you consider using a free or commercial vendor that can give you access to your analyzed data.  Then pass the important statistics you require to the analytics server via meta tags and acquire your data through calls to the service's API.  Persist the relevant data in a local database and draw your conclusions from it.  This will allow you to use not only hit based metrics but, more importantly, the analyzed visitor based metrics, which really are what you want.  You can then let the analytics continue by regular means and use the data from previous analysis to influence your site's design programmatically.  To do this you need to choose an analytics vendor that can provide the options required.  From my experience I know for sure you can do this with Webtrends, and I think you can with Google Analytics and Omniture.
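
As a rough illustration of the "persist the relevant data in a local database" step, here is a minimal Python sketch. It assumes the per-page visit counts have already been fetched from the vendor's API (that call is not shown, since it depends on the vendor). The table name, column names, and function names are made up for this example.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the local database that keeps the analyzed metrics."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_metrics ("
        "page TEXT PRIMARY KEY, visits INTEGER NOT NULL)"
    )
    return conn


def save_metrics(conn, metrics):
    """Store a {page: visits} mapping, overwriting older values per page."""
    conn.executemany(
        "INSERT OR REPLACE INTO page_metrics (page, visits) VALUES (?, ?)",
        list(metrics.items()),
    )
    conn.commit()


def most_popular(conn, limit=3):
    """Return the most visited pages, highest first."""
    rows = conn.execute(
        "SELECT page FROM page_metrics ORDER BY visits DESC, page ASC LIMIT ?",
        (limit,),
    )
    return [page for (page,) in rows]


# Hypothetical numbers standing in for data returned by the analytics service.
conn = open_store()
save_metrics(conn, {"/a.cfm": 10, "/b.cfm": 50, "/c.cfm": 30})
print(most_popular(conn, 2))
```

The site can then read the "most popular" list from this local table on each request instead of calling the analytics service every time.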

             

            In particular I would look into the Webtrends REST API at http://developer.webtrends.com

             

            Not sure if this is relevant, but a friend asked me about choosing between these vendors, speed considerations, and other things that would influence his decision.  Much of what follows is specific to his query.  I am including what I wrote for him in case it is of use, though a great deal of it is irrelevant here.  The spider and bot material may help, though.

             

            -Joe

             

            Regarding JavaScript Tag Speed:

            -----------------------------------------------

             

            If you are talking about standard HTML pages, then pretty much all of the Web Analytics solutions use much the same JavaScript tagging techniques.

             

            In most cases the JavaScript tag you apply to your page dynamically generates an image call to the appropriate logging server.  The image itself is not important, but the call carries the data that you want to have logged.  This is most often done by including the data as query parameters on the image request, though I have seen some funky JSON implementations via AJAX.
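
To make the "data as query parameters on the image request" idea concrete, here is a minimal Python sketch of how such a URL is put together. The collector host, image name, and parameter names are hypothetical; every vendor uses its own.

```python
from urllib.parse import urlencode


def build_beacon_url(collector_base, page_data):
    """Return an image URL whose query string carries the data to be logged."""
    return f"{collector_base}?{urlencode(page_data)}"


url = build_beacon_url(
    "https://collector.example.com/hit.gif",  # hypothetical logging server
    {"page": "/products/list.cfm", "title": "Product List"},
)
print(url)
```

In a real tag this string is built in the browser by JavaScript and assigned to an image's source, which is what triggers the request to the logging server.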

             

            The data collection server identifies where to place the log data.  The query parameters are stripped off the image request and used to generate the log file, which will later be analyzed.  Realtime (or near realtime) statistics are collected at this point and are generally only available for hit based stats, as visit metrics cannot be analyzed until the visit has closed.
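
The collection side can be sketched in the same way. The Python below only illustrates the idea of stripping the query parameters off the image request and turning them into a log entry. The JSON log format and the function name are assumptions made for this example, not any vendor's actual format.

```python
import json
from urllib.parse import parse_qs, urlsplit


def hit_to_log_line(request_url, user_agent):
    """Turn one image request into a single JSON log line."""
    query = urlsplit(request_url).query
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    record = {"user_agent": user_agent, **params}
    return json.dumps(record, sort_keys=True)


line = hit_to_log_line(
    "https://collector.example.com/hit.gif?page=%2Fhome.cfm&visitor=abc123",
    "Mozilla/5.0",
)
print(line)
```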

             

            There is not much of a speed difference between the different vendors' JavaScript tags, or at least not enough to be worth worrying about.  For example, the Webtrends tag (with all options included) loads in under 50 milliseconds, which is not enough for a user of your site to notice.  So don't worry about tag speed when choosing a service partner or software.

             

            Regarding Spiders, Bots and Automated systems:

            -------------------------------------------------------------------------

             

            There is no perfect way to remove all spider and bot traffic from log statistics.  In my experience, for a non-commercial site at least 30% of the traffic in your standard web log files will be generated by spiders and bots.  On a commercial site, or if you are a government agency, this can skyrocket to 60% or more, as these are often prime targets for automated systems intent on finding a penetration point.

             

            Now that is standard log files.  If you are implementing JavaScript tags to tag your site, then the traffic details will be much more reasonable.  Most known spiders, bots and automated systems will not trigger a JavaScript call (though they could trigger an image request in a <noscript> tag).  So simply by tagging your sites with a JavaScript tag mechanism and analyzing that data, you can generally strip out enough of the spider and bot traffic that you no longer have to worry about it much.

             

            (If it is mission critical to have absolutely NO spider, bot or automated system details in your log file you would want to implement an In House solution where the Log files are readily available to you and in a format that a Log Scrubber could be utilized to analyze and remove hits based on the User Agent string and rules you provide.  Webtrends Software solutions would be a good choice for this.)
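
For a feel of what a User Agent based scrubber does, here is a minimal Python sketch. The rule list is illustrative only and far from complete, and the hit records are made-up sample data. A real scrubber would work from a maintained list of known agents.

```python
import re

# Illustrative rules only; real rule lists are much longer.
BOT_RULES = ["bot", "spider", "crawl", "slurp"]
BOT_RE = re.compile("|".join(BOT_RULES), re.IGNORECASE)


def scrub(hits):
    """Return only the hits whose User Agent matches none of the bot rules."""
    return [hit for hit in hits if not BOT_RE.search(hit["user_agent"])]


sample_hits = [
    {"page": "/home.cfm", "user_agent": "Mozilla/5.0 (Windows NT 6.0) Firefox/3.0"},
    {"page": "/home.cfm", "user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"page": "/about.cfm", "user_agent": "Yahoo! Slurp"},
]
print(scrub(sample_hits))
```

Note that this only catches agents that identify themselves. Anything that sends a browser-like User Agent string will pass through, which is why no scrubber is perfect.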

             

            If, on the other hand, you would like your ColdFusion server to read the User Agent string and decide whether or not to provide the JavaScript tag, it WILL add a great deal of extra overhead to your site's execution time.  It is not worth it.

             

            Also, unless your site is required to not use JavaScript (as some government sites are), there is no legitimate reason to not use a JavaScript tag.

             

            Regarding where to Collect Data:

            -----------------------------------------------

             

            So DO let the CF Server hosting/generating your sites include JavaScript tags.  DO NOT bother trying to decide who to serve the tags to.  Just send them with all requests.

             

            What you DO NOT want to do is have the CF Server hosting your site(s) be the same server that collects the returned data.  That would be a very big NO NO for many reasons I will not go into here.  Just trust me.  Don't do it.  If your collection occurs in house, that server needs to be its own stand alone system, or several systems which use Round Robin for load balancing.

             

            If you are using an On Demand service such as those provided by Webtrends, Google Analytics or Omniture, then this is not a problem.  The data collection happens on their server farms.

             

            Regarding which to use, In House Software or On Demand:

            ------------------------------------------------------------------

             

            If you want the ultimate in control and configurability, have the budget, and don't mind the added responsibility, you may want to consider an In House software solution.  This is more costly in terms of resources and time but allows you full control over every aspect of your solution.  This pretty much rules out Google Analytics, as I do not believe they have a software solution at this time.  Webtrends and Omniture both offer in house solutions.  These are ideal for intranets in the corporate environment.  I personally would use Webtrends, but that is simply because I am more familiar with their software solution.  Look around and make your own choice.  Most vendors will offer free trials you can play with.

             

            Regarding On Demand Services:

            -----------------------------------------------

             

            Which On Demand service to choose depends greatly on your level of traffic and the skill of the team who will be implementing the solution (perhaps just you).  I have had a lot of experience tagging sites with Webtrends, some with Google Analytics, and only a little with Omniture and other solutions.  Below are simply my opinions on each.

             

            Google Analytics:

             

            Pros:  They are great for smaller sites with less than 5 million hits a month.  They have some very well formed tools and great documentation.  So if your traffic level is less than this I would suggest using GA.

             

            Cons:  What they do not have is a support team that will provide as comprehensive a support service as some of the commercial vendors.  If you want great support you generally have to pay for it.

             

            Also, Google is not really set up to handle very large capacity sites.  For example, according to Orbitz they sent Webtrends 1 billion hits in one day, and Webtrends happily accepted and analyzed it without problems.  I do not think you would be able to do that with GA.

             

            So if you are certain that growth on the site over the next year will exceed 5 million page views per month AND budget is not the primary concern, you will want to seek a commercial vendor.

             

            Of the commercial solutions the two big boys are Omniture and Webtrends.

             

            Webtrends:

             

            Pros:  Webtrends solutions are the most powerful overall.  They certainly do not have capacity problems, and you can simply do more with their solutions if you know what you are doing.  You will pay more for their solutions in the short term, but their packages are more complete.  They offer more options than most vendors for acquiring your data, both analyzed and raw, outside of their User Interface.

             

            Cons:  Their User Interface leaves much to be desired.  (They are working on that, though, and are designing ways for you to build your own UI if you do not like theirs.  Hopefully this will materialize soon.)

             

            Omniture:

             

            Pros:  Their UI at the moment is better thought out and easier to get around in for a beginner with less training or experience.  Their solution will be less expensive at first and is reasonably powerful.

             

            Cons:  The starting price seems low, but they charge you for every feature you would want to add.  This is sort of like how PC vendors will sucker you in with a low priced PC, but once you start adding extra options the price skyrockets.

             

            Word of warning:  When pricing any vendor's solution, make sure the quote includes everything you require AND everything you could want.  Then, if the price is too high but you still want to use their service, start removing things that are not absolutely required.  If you can still live with the result, add it to the list of possibilities.  Otherwise drop it, don't consider it anymore, and give the vendor feedback on why.