First, <cfscript> and <script> are completely different. The former runs CFML on the server; the latter runs JavaScript (or another client-side language) in the browser.
Reading every file and looking for script tags seems pretty simple. <cfdirectory> gets you started by listing the files. <cffile> is next, to read each one. Finally, there is that handy dandy "contains" operator in CF.
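A rough sketch of those three steps is below. The folder name "uploads" and the variable names are made up for illustration, and the check is the naive one described above, so treat it as a starting point rather than a finished scanner.

```cfm
<!--- Hypothetical folder to scan; point this at your own upload directory --->
<cfset uploadDir = expandPath("./uploads")>

<!--- Step 1: list everything in the folder --->
<cfdirectory action="list" directory="#uploadDir#" name="fileList">

<cfloop query="fileList">
    <!--- Skip subdirectories; only plain files are read --->
    <cfif fileList.type EQ "File">
        <!--- Step 2: read the whole file into a variable --->
        <cffile action="read" file="#fileList.directory#/#fileList.name#" variable="fileContent">

        <!--- Step 3: naive test with the contains operator --->
        <cfif fileContent contains "<script">
            <cfoutput>#fileList.name# appears to contain a script tag<br></cfoutput>
        </cfif>
    </cfif>
</cfloop>
```

This only catches the literal text "<script", so variants with extra whitespace or encoding would slip through.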
"use the <cfloop> function to open a file and loop through it line by line so that you could process this file in CF 8"
You can do that in earlier versions as well. It just requires a few extra steps. Read the file into memory, then loop through it like a list, delimited by newline characters (i.e., chr(10) and/or chr(13)). It is slightly less efficient than the CF8 method, as you are reading the whole file into memory at once. But the overall concept is the same.
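Something along these lines should work for the pre-CF8 approach. The file path and variable names are placeholders I made up, and the "<script" test is just a stand-in for whatever check you need.

```cfm
<!--- Hypothetical file path, for illustration only --->
<cfset filePath = expandPath("./uploads/sample.txt")>

<!--- Read the entire file into memory in one go --->
<cffile action="read" file="#filePath#" variable="fileContent">

<!--- Treat the content as a list delimited by LF and/or CR --->
<cfset lineCount = 0>
<cfloop list="#fileContent#" index="line" delimiters="#chr(10)##chr(13)#">
    <cfset lineCount = lineCount + 1>
    <cfif FindNoCase("<script", line)>
        <cfoutput>Possible script tag in non-empty line #lineCount#<br></cfoutput>
    </cfif>
</cfloop>
```

One thing to keep in mind: list loops skip empty elements, so blank lines are not visited and the counter reflects non-empty lines only.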
You may want to check out http://portcullis.riaforge.org/. It is an open source project that scans for a number of security vulnerabilities. It doesn't directly scan files, but I am sure you can use that code to create a new function and pass in the content of the file to it.
Thanks for the suggestion. I've already adapted code within Portcullis to build a URL validator; i.e., detect semicolons and < and > that have been inserted into a URL. I suppose I could work something out, but I think combining <cfloop> using the CF8 file read mode with the CF REReplace and FindNoCase functions will enable me to do what I need to do -- detect the presence of all possible variants on "<script>" buried within a text file. While the cffile option would probably work with small text files, it wouldn't be a satisfactory solution for files larger than probably 100k. The bottom line will have to be that I just go forward using CF8 & CF9 and forget about backward compatibility.
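For reference, the CF8 line-by-line approach might look roughly like the sketch below. The file path, variable names, and the regular expression are illustrative assumptions, not a complete list of "<script>" variants.

```cfm
<!--- Hypothetical file path, for illustration only --->
<cfset filePath = expandPath("./uploads/sample.txt")>

<!--- Matches "<script", "< script", "</script" and similar, in any case --->
<cfset scriptPattern = "<\s*/?\s*script\b">
<cfset found = false>

<!--- CF8+: cfloop reads the file one line at a time --->
<cfloop file="#filePath#" index="line">
    <cfif REFindNoCase(scriptPattern, line)>
        <cfset found = true>
    </cfif>
</cfloop>

<cfoutput>Script tag found: #found#</cfoutput>
```

Because each line is tested on its own, a tag that is split across two lines would be missed by this sketch.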
Thanks to everyone for your suggestions.
"The bottom line will have to be that I just go forward using CF8 & CF9 and forget about backward compatibility."
You can get similar functionality under MX7 using a bit of Java. But it is strictly do-it-yourself. It is not difficult, but it would require createObject() access.
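As a rough idea of what the do-it-yourself version could look like, here is a sketch using java.io.BufferedReader through createObject(). The file path and variable names are made up. It also assumes the usual CF behavior where assigning a Java null leaves the variable undefined, which is worth verifying on your own server before relying on it.

```cfm
<!--- Hypothetical file path, for illustration only --->
<cfset filePath = expandPath("./uploads/sample.txt")>

<!--- FileReader uses the platform default character encoding --->
<cfset fileReader = createObject("java", "java.io.FileReader").init(filePath)>
<cfset lineReader = createObject("java", "java.io.BufferedReader").init(fileReader)>
<cfset found = false>

<cftry>
    <!--- readLine() returns null at end of file; assumed to leave "line" undefined --->
    <cfset line = lineReader.readLine()>
    <cfloop condition="isDefined('line')">
        <cfif FindNoCase("<script", line)>
            <cfset found = true>
        </cfif>
        <cfset line = lineReader.readLine()>
    </cfloop>
    <!--- Release the file handle explicitly --->
    <cfset lineReader.close()>
    <cfcatch type="any">
        <!--- Close on failure as well, then pass the error along --->
        <cfset lineReader.close()>
        <cfrethrow>
    </cfcatch>
</cftry>
```

Only one line is held in memory at a time, which is the main benefit over reading the whole file with <cffile>.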
You are basically trying to prevent cross-site scripting, and it is a little more complicated than simply looking for <script> tags. There are a few other tags that can be tweaked to produce the same result. Here are a couple of examples, starting with your original example:
<a href="#" onmouseover="http://badsite.com">Whatever you do, don't put the mouse here!</a>
There are others. I had to write my own CFX tag to filter this crap. My code was based on the Legitima HTML Parser (http://www.legitima.com) but I have no idea if they are even in business anymore (I'm looking at comments in code that is almost 10 years old).
Anyway, I only mention this so you will realize that the project might be much bigger than you think. I would try Googling "coldfusion safe html filter" or other variations. I would not be surprised if there is a Java component out there that does this sort of stuff.
Also keep in mind that, depending on how the data is presented, it may be possible to URL-encode or HTML-encode the < and > signs, and the net result might still be unwanted execution of script code.
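To illustrate the encoding point, a check could decode the text before testing it, roughly as sketched below. The sample string, variable names, and patterns are my own assumptions and are far from exhaustive. The event-handler pattern in particular is crude and will flag harmless text such as "one =".

```cfm
<!--- Made-up sample: a script tag hidden by URL encoding --->
<cfset rawText = "%3Cscript%3Ealert(1)%3C/script%3E">

<!--- Undo URL encoding; fall back to the raw text if decoding fails --->
<cftry>
    <cfset decodedText = URLDecode(rawText)>
    <cfcatch type="any">
        <cfset decodedText = rawText>
    </cfcatch>
</cftry>

<!--- Undo the common HTML entity forms of < (chr 60) and > (chr 62) --->
<cfset decodedText = REReplaceNoCase(decodedText, "&lt;|&##0*60;|&##x0*3c;", chr(60), "all")>
<cfset decodedText = REReplaceNoCase(decodedText, "&gt;|&##0*62;|&##x0*3e;", chr(62), "all")>

<!--- Flag script tags and inline event handlers such as onmouseover= --->
<cfset hasScriptTag = REFindNoCase("<\s*script\b", decodedText) GT 0>
<cfset hasEventHandler = REFindNoCase("\bon[a-z]+\s*=", decodedText) GT 0>
<cfset suspicious = hasScriptTag OR hasEventHandler>
```

Text that has been encoded more than once would need the decoding steps repeated, which is one reason a dedicated filter library tends to be a safer bet than hand-rolled patterns.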
You're right when you say that there are always going to be new things out there. Since I can barely spell "xss", I'm deeply concerned that some new thing is going to pop up that I have no idea about, and since I have truly taken a simplistic view of this matter, I'll look into the legitima.com offering. Would you be willing to share the CFX tag that you wrote?
On to a more pressing matter. I have written a CFC that parses the file in my simplistic fashion and returns either pass or fail. If it fails, I use <cffile action="delete"> to remove the file from the server, where it was uploaded using <cffile>. This call fails because the file "is locked by another user" for some amount of time, which I haven't been able to determine. I'm assuming that CF (<cfloop>) is holding onto it, but I don't know that.
I have not been able to find a complete description of the <cfloop> file function -- not in Forta's CF8 book or online. I have found bits and pieces, such as "From" and "Char" methods that exist. I don't know whether these are undocumented features or if they came out after the book was published. So here is my question: is there a way to close the file within the <cfloop> so that the file will be released as soon as the <cfloop> finishes?
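One possible way to keep control of the file handle, assuming the lock really does come from an unclosed handle, is to skip <cfloop file="..."> and use the CF8 functions FileOpen, FileReadLine, FileIsEOF, and FileClose instead. The sketch below uses made-up variable names and a simplistic "<script" test; it is not confirmed to be the cause or the cure of the lock described above.

```cfm
<!--- Hypothetical path of the uploaded file --->
<cfset filePath = expandPath("./uploads/sample.txt")>
<cfset result = "pass">

<!--- Open the file explicitly so it can be closed explicitly --->
<cfset fileObj = FileOpen(filePath, "read")>
<cftry>
    <cfloop condition="NOT FileIsEOF(fileObj)">
        <cfset line = FileReadLine(fileObj)>
        <cfif REFindNoCase("<\s*script", line)>
            <cfset result = "fail">
            <cfbreak>
        </cfif>
    </cfloop>
    <cfcatch type="any">
        <!--- Release the handle before passing the error along --->
        <cfset FileClose(fileObj)>
        <cfrethrow>
    </cfcatch>
</cftry>

<!--- Normal path: release the handle before touching the file again --->
<cfset FileClose(fileObj)>

<cfif result EQ "fail">
    <cffile action="delete" file="#filePath#">
</cfif>
```

The point of the design is that FileClose runs before the delete on every path, including when the loop exits early.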
Thanks again to all of you who have pitched in here.