11 Replies Latest reply on Jan 4, 2008 10:47 AM by ssawka

    Can't be done?

    akuo
      I sent this in a couple weeks ago and sadly got no reponse...so, trying 1 more time.

      I'm using CF to import data from an EXCEL sheet that has a mix of numbers and text onto a SQL Server database. I get a successful import of numbers, but have problems with the text import. Specifically, quotes, apostrophes, dashes, etc are converted to null (?) values--when you output onto a webpage, they turn into "square boxes". Anyone run into this problem and knows a fix?

      FYI. I can't set up DTS (as some have recommended) since database server is housed outside my domain.

      Also, I would like to keep the format of the text if possible (ex. breaks, bold, italics, etc.)

      The code below is the SELECT statement from the EXCEL worksheet, the ensuing statement (not shown) is the INSERT into the SQL Server.

      Thanks in advance!
        • 1. Re: Can't be done?
          Level 7
          "Anyone run into this problem and knows a fix?"

          Question Marks and Square Boxes replacing characters is an indication of
          character encoding issues. Somewhere along the line between the
          original Excel spread sheet and the output to a browser something is not
          understanding the character data it is receiving and substituting these
          symbols to indicate where the unknown characters are in the data.

          The fix is to know what character encoding you need to work with and to
          make sure that everything is configured to use the same character
          encoding. This can require modifications in your import mechanism, your
          database configuration, ColdFusion and HTML in insure that the same
          encoding is used throughout.

          HTH
          Ian


          • 2. Re: Can't be done?
            akuo Level 1
            Thanks Ian for your input.

            Yes, I believe you are correct. Even though I cannot create a DTS on production, I have made an attempt to import on my development platform via DTS and still have the same problem of the "unknown characters". And thus concluding that it was an import issue, I have tried using other drivers (EXCEL 4.0, etc) without any success.

            "...make sure that everything is configured to use the same character encoding." Do you (or anyone) have an idea how?

            Thanks!
            • 3. Re: Can't be done?
              Level 7
              akuo wrote:
              > I'm using CF to import data from an EXCEL sheet that has a mix of numbers and
              > text onto a SQL Server database. I get a successful import of numbers, but have
              > problems with the text import. Specifically, quotes, apostrophes, dashes, etc
              > are converted to null (?) values--when you output onto a webpage, they turn
              > into "square boxes". Anyone run into this problem and knows a fix?

              ? == garbaged chars, fatal encoding flaw
              [] == means the browser can't render that char using the selected font, could be
              minor encoding issue.

              > <cfquery name="getcomments" datasource="excelimport">
              > SELECT `Program Name` AS ProgramName,
              > `Budget Comments` AS comments
              > FROM [Budget_Comments$]
              > IN '#ExpandPath('./')#budget_plan.xls' 'EXCEL 5.0;'
              > </cfquery>

              can you simply dump out the results of this query? does the text look ok? i
              *think* excel's default encoding is utf-16 while cf's default is utf-8.
              • 4. Re: Can't be done?
                Level 7
                First things first. Have you identified where the problem lies. If you
                are only using a browser to view the imported data from the database and
                the browser is where the problem lies. It is entirly possible that the
                data was successfully imported into to the database but is becoming
                garbled upon retrieval for display in a browser.

                In other words, can you view the database data with a direct tool to
                confirm whether or not the data is garbled in it?
                • 5. Re: Can't be done?
                  akuo Level 1
                  Thank you both for your inputs...I believe you are both correct and have pinpointed the cause. But I still am not sure of the solution.

                  After importing into the SQL Server database via CF, I do see the quotations, apostrophes, dashes when I directly view the data using Enterprise Manager. However, when I view the data using Internet Explorer (v6.0.29), I get the "box"... [] ...for those grammatical marks. When I view using Mozilla Firefox (v2.0.0.6), those grammatical marks do not display (substituting a space for quotes and ignoring the dashes--> "1-23" will yield "123").

                  However, when I go to the EXCEL file, and re-type the quotation marks, apostrophes & dashes...the re-typed portions do display on both IE & Mozilla. (The customer does a copy & paste into the EXCEL...I can't control what he inputs).

                  So, I have tried changing the text fonts on the EXCEL sheet to Times, Arial, Courier...but to no avail. Not familiar with encoding, I'm assuming that there are various "codes" for the quotation mark, etc...and hence, the browser does not understand this version's code?

                  Is there a solution to this?

                  Thanks again for your help!!
                  • 6. Re: Can't be done?
                    Level 7
                    Sounds like you are dealing with Microsoft's extended character set
                    (windows-1252). Under default conditions Microsoft products, especially
                    Word, will use extended "curly" quotes, em-dashes, en-dashes and other
                    characters outside the normal ASCII set.

                    You can use the ColdFusion and HTML functions to control what character
                    sets the templates used to upload and output the data from the database.
                    The 'Developing Globalized Applications' chapter of the CF
                    documentation discusses all these options, it is more then a one line
                    step, but with a few tags it is usually pretty easy to get CFML and it's
                    output HTML to work with any desired character encoding.

                    Or you can do some type of pre-processing to the uploaded data to
                    convert the extended characters into the more universal basic ASCII
                    versions. I know in MS Word there is an option to save a file as a
                    "plain" file without these extended characters. I am not sure if Excel
                    has the same feature. But maybe exporting the data into a plain CSV or
                    Text file may do this. Some experimentation may be worthwhile.
                    • 7. Re: Can't be done?
                      akuo Level 1
                      And all this time, I thought I knew it all going though Ben Forta's books...I'll go through the CF documentation you mentioned above and let you know how it goes...sounds challenging & frustrating (and kind of fun too).

                      Of course, if someone has already gone through this and wants to share the solution...

                      Thanks!
                      • 8. Re: Can't be done?
                        Level 7
                        akuo wrote:

                        I have gone through this, which is why I am vary familiar with this
                        section of the documentation. The trouble is there is not "A" solution.
                        It is all choices and trade offs, depending on your systems, your
                        requirements and what you need to do. This is one of those places where
                        we can just point you to the path, it is up to you to walk the path.

                        • 9. Re: Can't be done?
                          Level 7
                          akuo wrote:
                          > And all this time, I thought I knew it all going though Ben Forta's
                          > books...I'll go through the CF documentation you mentioned above and let you

                          there's a chapter on cf globalization in the advanced cf books, did you read those?
                          • 10. Re: Can't be done?
                            Level 7
                            akuo wrote:
                            > After importing into the SQL Server database via CF, I do see the quotations,
                            > apostrophes, dashes when I directly view the data using Enterprise Manager.

                            what datatype for those columns?

                            > However, when I go to the EXCEL file, and re-type the quotation marks,
                            > apostrophes & dashes...the re-typed portions do display on both IE & Mozilla.
                            > (The customer does a copy & paste into the EXCEL...I can't control what he
                            > inputs).

                            you changed the codepoints for those "chars", while they might look similar
                            they're different.

                            > So, I have tried changing the text fonts on the EXCEL sheet to Times, Arial,
                            > Courier...but to no avail. Not familiar with encoding, I'm assuming that there

                            that's excel & nothing to do w/cf & the db.

                            • 11. Re: Can't be done?
                              ssawka Level 1
                              I remember at one point we used ListReplace to replace Microsoft's auto-correct characters with ASCII special characters, but you would need to find a list of the Microsoft specail characters.