I posted it originally in the Dreamweaver forum thinking it was a problem with the application. They say differently.
<cfprocessingdirective pageencoding="utf-8">
Andy
To expand slightly on what Andy says: if you have anything other than plain ASCII in a CFM template, you need to tell the CF compiler. That's what the directive does: it tells the CF compiler that the file is UTF-8, not just ASCII. Read the docs on it, though.
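One way to find out whether a template contains anything beyond ASCII is to scan it before saving. This is only a rough sketch in Python (not CFML, and not taken from the CF docs); the function name `find_non_ascii` and the sample template text are made up for illustration.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, char, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # ASCII covers code points 0..127 only
                hits.append((line_no, col_no, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical template content pasted from Word: note the curly quotes.
sample = "<cfoutput>\n<p>\u201cHello\u201d</p>\n</cfoutput>"

for hit in find_non_ascii(sample):
    print(hit)
```

If the scan reports nothing, the file is plain ASCII; if it reports anything, the template needs either cleaning up or an explicit encoding.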
--
Adam
So, not wanting to use that directive, I need to find a way to get from MS Word into Dreamweaver, and ensure the contents of the code are ASCII only.
So why doesn't Dreamweaver perform a conversion of incoming non-ASCII text to ASCII?
There is an option under the Commands menu to Clean Up Word HTML... not sure if that helps ...
> So, not wanting to use that directive, I need to find a way to get from MS Word into Dreamweaver, and ensure the contents of the code are ASCII only.
I dunno whether that's going to be possible. Some of the characters that Word uses simply aren't ASCII characters. Curly quotes, for example: there is no ASCII equivalent. And you can't expect any sort of automated process to work: just because both opening and closing curly quotes look pretty much like " to you and me doesn't mean that there is any automatic correlation between the curly ones and the ASCII one.
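Because there is no built-in correlation, any conversion has to spell out its own substitutions. The sketch below (in Python, purely for illustration) shows what such a hand-written mapping might look like; the table `WORD_TO_ASCII`, the helper names, and the choice of stand-in characters are all assumptions, and anything not listed passes through unchanged.

```python
# Hand-picked mapping; which ASCII stand-in is "right" is a judgment call.
WORD_TO_ASCII = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

_TABLE = str.maketrans(WORD_TO_ASCII)


def asciify(text):
    """Replace the characters listed above; anything else is left as-is."""
    return text.translate(_TABLE)


def is_ascii(text):
    """True if every character is in the ASCII range."""
    return all(ord(ch) < 128 for ch in text)


# Hypothetical text pasted from Word.
pasted = "\u201cIt\u2019s done\u201d \u2013 almost\u2026"
cleaned = asciify(pasted)
print(cleaned)
print(is_ascii(cleaned))
```

A table like this only covers the characters someone thought to list, so an accented letter or any other symbol would still come through as non-ASCII.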
The best approach is probably to save the text in Word as plain text (i.e. Save As... *.txt), and then grab the text from the text file, not from Word. After all, the best way to convert something that Word creates into some other format is to use Word to do it.
> So why doesn't Dreamweaver perform a conversion of incoming non-ASCII text to ASCII?
Why should it? It is completely valid to have any character encoding one likes in a web page, and DW is for creating web pages. The difference is that CFM files are not for creating web pages (although the results of running the CFM files might well be); they're for holding CFML code, which needs to be processed by the compiler, and compilers are less forgiving with characters than our eyes are. For reasons best known to Java (and I think it's a Java thing, not specifically a CF thing, although don't quote me on that), it needs to be actively told what the encoding of its source code files is, rather than checking for a BOM or simply inferring it, as most applications these days are quite capable of doing.
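For anyone unfamiliar with the term, "checking for a BOM" means looking at the first few bytes of a file for a byte order mark. A minimal sketch of the idea in Python follows; the helper name `sniff_bom` is invented here, and this says nothing about what CF or Java actually do internally.

```python
import codecs

# Byte order marks for the encodings this sketch knows about.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# Hypothetical file contents: one saved with a UTF-8 BOM, one without.
print(sniff_bom(codecs.BOM_UTF8 + b"<cfoutput>hi</cfoutput>"))
print(sniff_bom(b"<cfoutput>hi</cfoutput>"))
```

A file with no BOM gives no hint at all, which is why the encoding often has to be stated explicitly.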
One question I've got is why you are putting stuff created in Word (which sounds like content, not code) in your CFM files in the first place?
--
Adam
Why am I putting content into CFM files? Dunno - some pages have the odd paragraph of content, I suppose.
Anyway, I understand the problem now, so I can take steps to avoid it.
Everyone who's answered so far is on the right track, but I think there's one extra little twist. I suspect that you're copying ISO-8859-1 content and CF is trying to serve it as UTF-8. I believe UTF-8 is the default encoding for modern versions of CF. So, you might have to specify the region-specific encoding you're copying from (again, I suspect ISO-8859-1) instead of UTF-8.
As for why you're having this problem, it's not Dreamweaver's fault, or Word's fault, or CF's fault. It's the fact that you can specify different encodings every step of the way, and there's no automatic way for the computer to infer what you really want to use.
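To make the mismatch concrete, here is a small sketch in Python (chosen only for illustration; the sample word is made up) of what happens when bytes written in one encoding are read as another.

```python
text = "caf\u00e9"  # "café": the last character is outside ASCII

utf8_bytes = text.encode("utf-8")         # b'caf\xc3\xa9'
latin1_bytes = text.encode("iso-8859-1")  # b'caf\xe9'

# Same bytes, wrong label: UTF-8 bytes read as ISO-8859-1 give mojibake.
print(utf8_bytes.decode("iso-8859-1"))

# ISO-8859-1 bytes read as UTF-8 are not even valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    print("cannot decode as UTF-8:", err.reason)

# With errors="replace" the bad byte shows up as U+FFFD instead.
print(latin1_bytes.decode("utf-8", errors="replace"))
```

Nothing in the bytes themselves says which reading is intended, so every step that handles the text has to agree on the encoding.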
Read this:
http://www.joelonsoftware.com/articles/Unicode.html
Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
http://training.figleaf.com/
If you go into CFADMIN (/CFIDE/administrator) and click Settings Summary,
you'll see Java File Encoding, which tells you what the default encoding is.
Now, as Andy says, you can put that directive at the top of every page (IIRC you can't do it in Application.cfm; it has to be at the top of the actual file).
Or, if you have access to edit your JVM config, you can add this:
-Dfile.encoding=utf8
We've done this with success, and with no need for the processing directive.
HTH
More info about CF and UTF, although it's quite old, is here: http://www.thickpaddy.com/2009/8/10/coldfusion-is-not-utf-8-encoded