Try the cgi.http_referer variable.
The problem you are going to run into is that most web servers take a request for http://www.site.com or http://www.site.com/ and turn that into a 302 for the index document as specified in the web config, i.e.
- client requests http://www.site.com from server
- server sends an HTTP 302 -> client = http://www.site.com/index.cfm
- client sends a new request for http://www.site.com/index.cfm -> server answers with an HTTP 200
The reason you have the differentiation in your web server log is that each of the requests above is sometimes a separate line looking something like this:
220.127.116.11 - - [01/Apr/2006:01:01:01 GMT] "GET / HTTP/1.1" 302 941272
18.104.22.168 - - [01/Apr/2006:01:01:01 GMT] "GET /index.cfm HTTP/1.1" 200 941286
Now, the CGI.HTTP_REFERER variable that Dan mentioned will tell you about requests that *probably* don't have index.cfm in the request header, but it won't account for bookmarks and referers from https (b/c that variable isn't passed on requests from https -> http). If you can live with those exceptions, then Dan's suggestion is definitely the way to go.
Thanks for the reply - sorry to not get back to you.
I still have the problem that the CGI.HTTP_REFERER does not allow me to
differentiate between www.site.com and www.site.com/index.cfm to allow
me to redirect the second to the first without making an infinite loop.
This is being used for SEO purposes to stop Google listing these as 2
separate pages - they really do make life complicated!!
I won't ask why you are doing such a, ah, interesting exercise.
If you are doing the aforementioned redirection, set a cookie, session, or URL variable and then you can differentiate between normal redirects and your own.
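A rough sketch of that flag idea, assuming a cookie is used and the check sits at the top of index.cfm. The cookie name selfRedirect is made up for illustration, and the redirect is issued with cfheader rather than cflocation on the assumption that a permanent (301) status is wanted for SEO:

```cfm
<!--- Top of index.cfm. Sketch only; "selfRedirect" is a hypothetical cookie name. --->
<cfif StructKeyExists(cookie, "selfRedirect")>
    <!--- This request is the follow-up to our own redirect: clear the flag and carry on rendering. --->
    <cfcookie name="selfRedirect" value="" expires="now">
<cfelse>
    <!--- No flag yet: mark the client, then send it to the bare site root. --->
    <cfcookie name="selfRedirect" value="1">
    <cfheader statuscode="301" statustext="Moved Permanently">
    <cfheader name="Location" value="http://www.site.com/">
    <cfabort>
</cfif>
```

A client that never returns the cookie would be redirected on every request, so this only breaks the loop for cookie-accepting clients. It also costs one extra round trip on a first visit, even when the visitor asked for the bare root. A session or URL variable would follow the same pattern.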
I have to agree with MikerRoo about the "interesting exercise" part ... But if you just need an easy way to deal with Google and this has nothing to do with tracking users, why not just do some special handling for the googlebot in Application.cfm, whatever handling that might be?
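For the Googlebot idea, a minimal sketch of what the check in Application.cfm might look like, assuming the crawler is identified by its User-Agent string. request.isGooglebot is an illustrative name, and the body of the cfif is left open because the handling itself is undecided:

```cfm
<!--- Application.cfm. Sketch only; request.isGooglebot is a hypothetical variable name. --->
<cfset request.isGooglebot = FindNoCase("googlebot", cgi.http_user_agent) GT 0>
<cfif request.isGooglebot>
    <!--- Crawler-specific handling would go here. --->
</cfif>
```

Because the CGI scope returns an empty string for a missing header, the check is simply false when no User-Agent is sent.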
I'm back on this same issue again and can't believe I haven't found a
solution. I can isolate Google, but I still have the infinite loop
problem to solve once isolated. If there is no way to differentiate, CF
cannot tell if index.cfm is included in the URL or not. It will
therefore always try to redirect....
Not sure if it will work the same on every web server, but try
Just so that you know: a page that automatically redirects to another page is severely 'frowned on' by search engines, so you may end up with the entire site being ignored rather than forcing a specific page to show up in the search results.
Have you researched 'robots.txt'? You may be able to put some code in that file that instructs the Google spider on what page to 'start with'.
Hope this helps