21 Replies Latest reply on Mar 31, 2009 8:16 AM by apocalipsis19

    cffile and UTF-8

    apocalipsis19 Level 1
      Hello Community!

      I have a program that uploads a file to a remote FTP server. I am using cffile to write the file there, and it MUST be uploaded in UTF-8 format. Despite that, the file is being uploaded as ASCII or ANSI, anything except UTF-8.

      This is my line of code:
      <cffile action="write" file="#f_dir##f_name#" output="#dataHeader#" charset="utf-8">

      charset="utf-8" is not working for me.

      Does anybody else have the same problem? Any thoughts?

      Thanks!

      Ysais.
        • 1. Re: cffile and UTF-8
          Level 7
          apocalipsis19 wrote:
          > Hello Community!
          >
          > I have a program that uploads a file to a remote FTP server. I am using cffile
          > to write the file there and it MUST be uploaded in UTF-8 format. Despite that,
          > the file is being uploaded as ascii or ansi, anything except UTF-8.

          CFFILE is independent from CFFTP (I'm assuming you're using CFFTP to
          upload to the remote server). I think you're searching for the
          transferMode attribute of the CFFTP tag.

          --
          Mack
          • 2. Re: cffile and UTF-8
            Level 7
            > charset="utf-8" is not working for me.

            What version of CF? There was a problem with <cffile> and the charset
            attribute back in CFMX6.1... 7.x... I forget which.

            This could potentially be your problem.

            --
            Adam
            • 3. Re: cffile and UTF-8
              Level 7
              apocalipsis19 wrote:
              > Does anybody else have the same problem? Any thoughts?

              first check that the original file is actually encoded as utf-8.
              • 4. Re: cffile and UTF-8
                apocalipsis19 Level 1
                Adam, Paul and Mack thank you for your responses.

                I will review my file and your observations and I will post in here how it went.

                Thanks!
                • 5. Re: cffile and UTF-8
                  apocalipsis19 Level 1
                  Adam,

                  I am using CF8.

                  I double checked, and everywhere in my code where I create or append to the file I make sure that my charset is set to utf-8.

                  In the cfftp tag there's a transferMode attribute, but it can only be set to "ASCII", "ANSI" or "AUTO."

                  Any more thoughts gentlemen?

                  Thanks!

                  Ysais.
                  • 6. cffile and UTF-8
                    apocalipsis19 Level 1
                    Additional Information

                    Two of my original files are showing up as ANSI, and one of them as UTF-8. The funny thing is that the encoding looks different to me in Notepad++ and EditPlus: the first shows the encoding as ANSI, but the latter shows it as UTF-8.

                    I went a step further and checked cf_root/lib/neo-runtime.xml to make sure that the default character encoding was set to UTF-8. The more I dig into this, the fewer answers come to mind. The issue persists :-)

                    Thanks!

                    Ysais.
                    • 7. Re: cffile and UTF-8
                      Level 7
                      apocalipsis19 wrote:
                      > I double checked and every where in my code where I create or append to the
                      > file I make sure that my charset is set to utf-8.

                      open it in notepad to make sure. using a BOM?

                      > In the cfftp there's a transferMode attribute but it can only be set to
                      > "ASCII", "ANSI" or "AUTO."

                      there should be a binary option & i guess that's what you need (provided the
                      file's utf-8 in the 1st place).
                      • 8. Re: cffile and UTF-8
                        apocalipsis19 Level 1
                        Paul,

                        So I could set the transferMode attribute to "BINARY"?

                        Thanks!

                        Ysais.
                        • 9. Re: cffile and UTF-8
                          apocalipsis19 Level 1
                          My problem still persists.
                          • 10. Re: cffile and UTF-8
                            Level 7
                            apocalipsis19 wrote:
                            > My problem still persists.

                            I think you have only 2 steps: CFFILE and CFFTP. I'd check after each
                            step whether the file is *really* UTF-8, cutting the problem in half.
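
                            One way to make that check independent of any editor is to look at the raw bytes. Below is a minimal Java sketch, assuming JDK 7 or later; the class and method names (Utf8Check, hasUtf8Bom, isValidUtf8) are made up for illustration and are not part of ColdFusion. It reports whether the data starts with the UTF-8 BOM (EF BB BF) and whether it decodes as strict UTF-8. Note that a pure-ASCII file also passes the second test, so "valid UTF-8" alone does not prove the writer used UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of ColdFusion or the original code.
final class Utf8Check {

    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // True if the data decodes as UTF-8 with no malformed byte sequences.
    static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java Utf8Check <path>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("BOM present: " + hasUtf8Bom(data));
        System.out.println("valid UTF-8: " + isValidUtf8(data));
    }
}
```

                            Running it once against the file CFFILE wrote and once against a copy downloaded back from the FTP server should show which of the two steps, if either, changes the bytes.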

                            --
                            Mack
                            • 11. Re: cffile and UTF-8
                              apocalipsis19 Level 1
                              Well,

                              I have done further research on this issue and all of my code is correct. The problem is the underlying JVM. It does not properly support adding the Byte Order Mark to a UTF-8 file. Some people suggest adding the file through Java code inside the cfscript tags.
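
                              For reference, here is a rough sketch of what that Java route might look like, under the assumption that the receiving side really wants a BOM. The class and method names (Utf8BomWriter, encodeWithBom, writeFile) are hypothetical, and the code sticks to core classes available since Java 6. It writes the three BOM bytes by hand and then the UTF-8 encoded text, so it does not rely on the encoder emitting a BOM.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of ColdFusion or the original code.
final class Utf8BomWriter {

    private static final Charset UTF8 = Charset.forName("UTF-8");

    // The three bytes of the UTF-8 byte order mark.
    private static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns the BOM followed by the UTF-8 encoding of the text.
    static byte[] encodeWithBom(String text) {
        byte[] body = text.getBytes(UTF8);
        byte[] out = new byte[BOM.length + body.length];
        System.arraycopy(BOM, 0, out, 0, BOM.length);
        System.arraycopy(body, 0, out, BOM.length, body.length);
        return out;
    }

    // Writes the text to the given path, replacing any existing file.
    static void writeFile(String path, String text) throws IOException {
        OutputStream out = new FileOutputStream(path);
        try {
            out.write(encodeWithBom(text));
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java Utf8BomWriter <path> <text>");
            return;
        }
        writeFile(args[0], args[1]);
    }
}
```

                              From cfscript, classes like java.io.FileOutputStream would presumably be reached through createObject("java", ...); that wiring is not shown here.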

                              I will look deeper into this, and I continue to appreciate any ideas you guys give me!

                              Thanks!

                              Ysais.
                              • 12. Re: cffile and UTF-8
                                Level 7
                                apocalipsis19 wrote:
                                > I have done further research on this issue and all of my code is correct. The
                                > problem is the underlying JVM. It does not properly support adding the Byte
                                > Order Mark to a UTF-8 file. Some people suggest adding the file through Java

                                a BOM is *optional* for utf-8 by definition (and if you read the definition
                                you'll see why it's also pretty much un-needed). is the app on the other end
                                expecting a BOM?

                                > code inside the cfscript tags.

                                if your research is correct about the JVM & BOM writing (i think not, it's
                                optional so the app should handle writing it to a new file), then it's six of
                                one, half dozen of the other.


                                what is the app on the other end expecting *exactly*? can you put up the before
                                & after data (zipped up to preserve encoding)?
                                • 13. Re: cffile and UTF-8
                                  Level 7
                                  apocalipsis19 wrote:
                                  > Well,
                                  >
                                  > I have done further research on this issue and all of my code is correct. The
                                  > problem is the underlying JVM. It does not properly support adding the Byte
                                  > Order Mark to a UTF-8 file. Some people suggest adding the file through Java
                                  > code inside the cfscript tags.
                                  >
                                  > I will look deeper into this and I continue to appreciate any ideas you
                                  > guys give me!

                                  I found this java bug that is related to the problem. It's about reading
                                  UTF-8 files with BOM but if it's not transparent on read I doubt it's
                                  transparent on write:
                                  http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

                                  --
                                  Mack
                                  • 14. Re: cffile and UTF-8
                                    apocalipsis19 Level 1
                                    Paul,

                                    The application on the other end is expecting a UTF-8 encoded file. What really troubled me at first is that when I opened the file with EditPlus it said that the file was UTF-8, but when I opened the file with Notepad++ it said that it was ANSI. My charset attribute is set to UTF-8 in my cffile tags. The transferMode attribute in the cfftp tag is set to BINARY. I will continue submitting the file until I fix this problem.

                                    Mack,

                                    Thanks for the link, I am looking into that. I will post in here whatever happens for future reference or other fellows' reference.

                                    If you guys come up with something else I will be more than happy to read about it.

                                    Thanks!

                                    Ysais.
                                    • 15. Re: cffile and UTF-8
                                      Level 7
                                      Mack wrote:
                                      > I found this java bug that is related to the problem. It's about reading
                                      > UTF-8 files with BOM but if it's not transparent on read I doubt it's
                                      > transparent on write:
                                      > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

                                      and sun marked that "bug" as "Closed, Will Not Fix". sun's not going to fix
                                      something that it considers not "broken" (there are also "bugs" related to java
                                      not compiling source with a BOM as well) or that will create backwards
                                      compatibility problems--a BOM is optional for utf-8 (and pretty much useless in
                                      utf-8 anyway) but required for utf-16 which java handles ok (if i remember rightly).

                                      and just an FYI, sun usually gives i18n bugs short shrift. some locale resource
                                      bugs (and i mean real bugs like stuff where they get currency/numeric formatting
                                      dead wrong) have been around for >5 years.

                                      • 16. Re: cffile and UTF-8
                                        Level 7
                                        apocalipsis19 wrote:
                                        > The application on the other end is expecting a UTF-8 encoded file. What

                                        ok let me try again, a BOM is optional for utf-8 encoded files. is that app
                                        expecting a BOM or just utf-8 with or w/out a BOM?

                                        > really troubled me at first is that when I opened the file with EditPlus it
                                        > said that the file was UTF-8 but when I opened the file with Notepad++ it said

                                        don't use either but did you save or modify the file in any way?
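
                                        one possible reason the two editors disagree, sketched in Java (JDK 7+ assumed, class and method names made up for illustration): with no BOM, text that contains only ASCII characters encodes to exactly the same bytes in utf-8 and in a single-byte "ANSI" code page, so an editor can only guess at the label. ISO-8859-1 stands in for the Windows code page here, which is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
final class SameBytesDemo {

    // True if the text encodes to identical bytes in ISO-8859-1 and UTF-8.
    static boolean sameBytes(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(latin1, utf8);
    }

    public static void main(String[] args) {
        // ASCII-only text: identical bytes in both encodings.
        System.out.println(sameBytes("id,name"));    // prints true
        // One accented character: 1 byte in ISO-8859-1, 2 bytes in UTF-8.
        System.out.println(sameBytes("caf\u00e9"));  // prints false
    }
}
```

                                        if the data has no characters outside ASCII, "ANSI" in one editor and "UTF-8" in another can both describe the same bytes, which is why the before & after files matter more than the editor label.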
                                        • 17. Re: cffile and UTF-8
                                          apocalipsis19 Level 1
                                          Thanks Paul!

                                          The file should just be UTF-8. That would solve my problem.


                                          I am just opening the file in those text editors to see the encoding of the file.
                                          • 18. Re: cffile and UTF-8
                                            Level 7
                                            apocalipsis19 wrote:
                                            > The file should just be UTF-8. That would solve my problem.

                                            again, can i see a zipped up version before & after uploading?
                                            • 19. Re: cffile and UTF-8
                                              apocalipsis19 Level 1
                                              Paul,

                                              You said here:

                                              "don't use either "

                                              I think that the party I am sending this file to is just opening the file in Notepad++, and when he sees that it says ANSI there, he requests a different file. This file is supposed to be processed on their servers, but I haven't got any output from the processing software, just the feedback from the guy who administers the servers.

                                              This has turned out to be a big project for me in terms of time.

                                              Thanks a lot!
                                              • 20. Re: cffile and UTF-8
                                                apocalipsis19 Level 1
                                                Sure! How do I send it to you?
                                                • 21. Re: cffile and UTF-8
                                                  apocalipsis19 Level 1
                                                  Ok, I just sent them to you.