
cffile and UTF-8

Participant, Mar 25, 2009

Hello Community!

I have a program that uploads a file to a remote FTP server. I am using cffile to write the file there and it MUST be uploaded in UTF-8 format. Despite that, the file is being uploaded as ASCII or ANSI, anything except UTF-8.

This is my line of code:
<cffile action="write" file="#f_dir##f_name#" output="#dataHeader#" charset="utf-8">

charset="utf-8" is not working for me.

Does anybody else have the same problem? Any thoughts?

Thanks!

Ysais.
TOPICS
Advanced techniques

LEGEND, Mar 25, 2009

apocalipsis19 wrote:
> Hello Community!
>
> I have a program that uploads a file to a remote FTP server. I am using cffile
> to write the file there and it MUST be uploaded in UTF-8 format. Despite that,
> the file is being uploaded as ascii or ansi, anything except UTF-8.

CFFILE is independent from CFFTP (I'm assuming you're using CFFTP to
upload to the remote server). I think you're searching for the
transferMode attribute of the CFFTP tag.

--
Mack

LEGEND, Mar 25, 2009

> charset="utf-8" is not working for me.

What version of CF? There was a problem with <cffile> and the charset
attribute back in CFMX6.1... 7.x... I forget which.

This could potentially be your problem.

--
Adam

Participant, Mar 26, 2009

Adam,

I am using CF8.

I double-checked, and everywhere in my code where I create or append to the file I make sure that my charset is set to utf-8.

In the cfftp tag there's a transferMode attribute, but it can only be set to "ASCII", "ANSI" or "AUTO."

Any more thoughts gentlemen?

Thanks!

Ysais.

LEGEND, Mar 26, 2009

apocalipsis19 wrote:
> I double checked and everywhere in my code where I create or append to the
> file I make sure that my charset is set to utf-8.

open it in notepad to make sure. using a BOM?

> In the cfftp there's a transferMode attribute but it can only be set to
> "ASCII", "ANSI" or "AUTO."

there should be a binary option & i guess that's what you need (provided the
file's utf-8 in the 1st place).
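To illustrate why a binary transfer matters here, below is a minimal sketch in Python rather than CFML. It assumes the file's bytes are already correct on disk and only need to reach the server untouched. The helper names `upload_unchanged` and `upload_file`, and the host and credential parameters, are hypothetical; `ftplib.FTP.storbinary` is the standard-library call that sends data in binary (TYPE I) mode.

```python
import io
from ftplib import FTP


def upload_unchanged(ftp, local_bytes: bytes, remote_name: str) -> None:
    """Send bytes in binary (image) mode so the server stores them as-is."""
    # storbinary switches the connection to TYPE I before STOR, so no
    # line-ending or character translation is applied in transit.
    ftp.storbinary(f"STOR {remote_name}", io.BytesIO(local_bytes))


def upload_file(host: str, user: str, password: str,
                local_path: str, remote_name: str) -> None:
    """Hypothetical wrapper: connect, log in, and upload one file."""
    with open(local_path, "rb") as fh:  # read raw bytes, no decoding
        data = fh.read()
    with FTP(host) as ftp:
        ftp.login(user, password)
        upload_unchanged(ftp, data, remote_name)
```

The point of the sketch is that the transfer step should never decode or re-encode the text. If the bytes on disk are UTF-8, a binary transfer delivers the same bytes to the other end.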

LEGEND, Mar 25, 2009

apocalipsis19 wrote:
> Does anybody else have the same problem? Any thoughts?

first check that the original file is actually encoded as utf-8.
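One way to run that check without trusting an editor's guess is to look at the bytes directly. The following is a rough sketch in Python (not CFML); the function name `describe_encoding` and its four labels are made up for illustration.

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes by what can be proven from them."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"      # cannot be UTF-8 at all
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-bom"    # starts with EF BB BF
    if all(b < 0x80 for b in data):
        return "ascii"        # valid UTF-8, but also valid "ANSI"
    return "utf-8"            # has multi-byte sequences, no BOM
```

A result of "ascii" means the file is valid UTF-8 but contains nothing that distinguishes it from a single-byte code page, so different tools may label it differently.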

Participant, Mar 26, 2009

Adam, Paul and Mack thank you for your responses.

I will review my file and your observations and I will post in here how it went.

Thanks!

Participant, Mar 26, 2009

Additional Information

Two of my original files are showing up as ANSI and one as UTF-8. The funny thing is that the same file's encoding looks different in Notepad++ and EditPlus: the first reports ANSI, the second reports UTF-8.

I went a step further, opened cf_root/lib/neo-runtime.xml, and made sure that the default character encoding was set to UTF-8. The more I dig into this, the fewer answers come to mind. The issue persists :-)

Thanks!

Ysais.
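A possible explanation for the two editors disagreeing, sketched in Python: text that contains only ASCII characters produces exactly the same bytes in UTF-8 and in a Windows "ANSI" code page, so an editor can only guess. The sketch assumes "ANSI" means Windows-1252 (`cp1252`); the helper name `encodes_identically` and the sample strings are made up.

```python
def encodes_identically(text: str, a: str = "utf-8", b: str = "cp1252") -> bool:
    """True when both encodings yield the same bytes for this text."""
    return text.encode(a) == text.encode(b)


ascii_only = "id,name\n1,Smith\n"
accented = "id,name\n1,Muñoz\n"

# Pure ASCII: nothing in the bytes tells UTF-8 and ANSI apart.
print(encodes_identically(ascii_only))

# One non-ASCII character is enough to make the files differ:
# "ñ" is the two bytes C3 B1 in UTF-8 but the single byte F1 in cp1252.
print(encodes_identically(accented))
```

If the data being written happens to be plain ASCII and has no BOM, a label of "ANSI" in one editor and "UTF-8" in another does not by itself show that anything is wrong with the file.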

Participant, Mar 26, 2009

Paul,

So I could set the transferMode attribute to "BINARY"?

Thanks!

Ysais.

Participant, Mar 30, 2009

My problem still persists.

LEGEND, Mar 30, 2009

apocalipsis19 wrote:
> My problem still persists.

I think you have only 2 steps: CFFILE and CFFTP. I'd check after each
step whether the file is *really* UTF-8, cutting the problem in half.

--
Mack
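A small sketch of that halving approach, in Python: capture the bytes after the write step and again after downloading the uploaded file back, then find where they first diverge. The function name `first_difference` is hypothetical, and how the two byte strings are obtained is left out.

```python
from typing import Optional


def first_difference(before: bytes, after: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(before, after)):
        if x != y:
            return offset
    if len(before) != len(after):
        # One input is a prefix of the other; they diverge where it ends.
        return min(len(before), len(after))
    return None
```

A result of None after the FTP step would point at the write step (or at the data itself) rather than the transfer. An offset at the first line break would suggest line-ending translation by a non-binary transfer mode.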

Participant, Mar 30, 2009

Well,

I have done further research on this issue and all of my code is correct. The problem is the underlying JVM. It does not properly support adding the Byte Order Mark to a UTF-8 file. Some people suggest writing the file through Java code inside the cfscript tags.

I will look deeper into this, and I continue to appreciate any ideas you guys give me!

Thanks!

Ysais.

LEGEND, Mar 30, 2009

apocalipsis19 wrote:
> I have done further research on this issue and all of my code is correct. The
> problem is the underlying JVM. It does not properly support adding the Byte
> Order Mark to a UTF-8 file. Some people suggest adding the file through Java

a BOM is *optional* for utf-8 by definition (and if you read the definition
you'll see why it's also pretty much un-needed). is the app on the other end
expecting a BOM?

> code inside the cfscript tags.

if your research is correct about the JVM & BOM writing (i think not, it's
optional so the app should handle writing it to a new file), then it's six of
one, half dozen of the other.


what is the app on the other end expecting *exactly*? can you put up the before
& after data (zipped up to preserve encoding)?
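To make the "optional BOM" point concrete, here is a short Python sketch. The helper `encode_utf8` is hypothetical; `codecs.BOM_UTF8` is the standard-library constant for the three bytes EF BB BF.

```python
import codecs


def encode_utf8(text: str, with_bom: bool = False) -> bytes:
    """Encode text as UTF-8, optionally prefixed with the 3-byte BOM."""
    body = text.encode("utf-8")
    return codecs.BOM_UTF8 + body if with_bom else body


plain = encode_utf8("é")           # b"\xc3\xa9"
marked = encode_utf8("é", True)    # b"\xef\xbb\xbf\xc3\xa9"

# Both are valid UTF-8 for the same text; "utf-8-sig" accepts either form.
assert plain.decode("utf-8-sig") == marked.decode("utf-8-sig") == "é"
```

The BOM changes only the first three bytes of the file, so whether the receiving application needs it is a question about that application, not about whether the file is UTF-8.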

Participant, Mar 30, 2009

Paul,

The application on the other end is expecting a UTF-8 encoded file. What really troubled me at first is that when I opened the file with EditPlus it said that the file was UTF-8, but when I opened it with Notepad++ it said that it was ANSI. My charset attribute is set to UTF-8 in my cffile tags. The transferMode attribute in the cfftp tag is set to BINARY. I will continue submitting the file until I fix this problem.

Mack,

Thanks for the link, I am looking into that. I will post in here whatever happens for future reference or other fellows' reference.

If you guys come up with something else I will be more than happy to read about it.

Thanks!

Ysais.

LEGEND, Mar 30, 2009

apocalipsis19 wrote:
> The application on the other end is expecting a file UTF-8 encoded. What

ok let me try again, a BOM is optional for utf-8 encoded files. is that app
expecting a BOM or just utf-8 with or w/out a BOM?

> really troubled me at first is that when I opened the file with EditPlus it
> said that the file was UTF-8 but when I opened the file with Notepad++ it said

don't use either but did you save or modify the file in any way?

LEGEND, Mar 30, 2009

apocalipsis19 wrote:
> Well,
>
> I have done further research on this issue and all of my code is correct. The
> problem is the underlying JVM. It does not properly support adding the Byte
> Order Mark to a UTF-8 file. Some people suggest adding the file through Java
> code inside the cfscript tags.
>
> I will look deeper into this and I continue to appreciate any ideas you
> guys give me!

I found this Java bug that is related to the problem. It's about reading
UTF-8 files with a BOM, but if it's not transparent on read I doubt it's
transparent on write:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

--
Mack
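The read-side behaviour that bug report describes (a UTF-8 decoder that does not skip a leading BOM) can be illustrated in Python, used here only as an analogy since its plain "utf-8" codec behaves the same way. The helper `strip_bom` is hypothetical.

```python
import codecs

raw = codecs.BOM_UTF8 + "header".encode("utf-8")

# The plain UTF-8 decoder keeps the BOM as the character U+FEFF.
kept = raw.decode("utf-8")

# The "utf-8-sig" codec drops a leading BOM if one is present.
dropped = raw.decode("utf-8-sig")


def strip_bom(text: str) -> str:
    """Remove one leading U+FEFF if a decoder left it in place."""
    return text[1:] if text.startswith("\ufeff") else text
```

A consumer that decodes this way sees an extra invisible character before the first field unless it strips it, which is one reason a BOM on a UTF-8 file can cause problems as easily as it solves them.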

LEGEND, Mar 30, 2009

Mack wrote:
> I found this java bug that is related to the problem. It's about reading
> UTF-8 files with BOM but if it's not transparent on read I doubt it's
> transparent on write:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

and sun marked that "bug" as "Closed, Will Not Fix". sun's not going to fix
something that it considers not "broken" (there are also "bugs" related to java
not compiling source with a BOM as well) or that will create backwards
compatibility problems--a BOM is optional for utf-8 (and pretty much useless in
utf-8 anyway) but required for utf-16 which java handles ok (if i remember rightly).

and just an FYI, sun usually gives i18n bugs short shrift. some locale resource
bugs (and i mean real bugs, like stuff where they get currency/numeric formatting
dead wrong) have been around for >5 years.

Participant, Mar 31, 2009

Thanks Paul!

The file should just be UTF-8. That would solve my problem.


I am just opening the file in those text editors to see the encoding of the file.

LEGEND, Mar 31, 2009

apocalipsis19 wrote:
> The file should just be UTF-8. That would solve my problem.

again, can i see a zipped up version before & after uploading?

Participant, Mar 31, 2009

Sure! How do I send it to you?

Participant, Mar 31, 2009

Paul,

You said here:

"don't use either "

I think that the party I am sending this file to is just opening the file in Notepad++, and when he sees that it says ANSI he requests a different file. This file is supposed to be processed on their servers, but I haven't received any output from the processing software, just the feedback from the guy who administers the servers.

This turned out to be a big project for me in terms of time.

Thanks a lot!

Participant, Mar 31, 2009

Ok, I just sent them to you.
