• Global community
    • Language:
      • Deutsch
      • English
      • Español
      • Français
      • Português
  • 日本語コミュニティ
    Dedicated community for Japanese speakers
  • 한국 커뮤니티
    Dedicated community for Korean speakers
Exit
Locked
1

Importing CSV file with Data Merge Fails

Explorer ,
Mar 23, 2013 Mar 23, 2013

Copy link to clipboard

Copied

Specs

See pasted text from CSV at http://pastebin.com/mymhugpN

I am using InDesign CS6 (8.0.1)

I created the CSV by downloading it from a Google Spreadsheet as a CSV. I confirm with the Terminal that the character encoding is utf-8 usnig the file command.

Problem detailed

I am trying to import a CSV file (utf-8) with Data Merge via the Select Data Source... command with Show Import Options checked. When viewing the Data Source Import Options dialog, I set the following options—Delimiter:Comma, Encoding:Unicode, Platform:Macintosh. I leave Preserve Spaces in Data Source unchecked. It fails to import any variables and produces no error message. I have tried other CSV files as well (created TextEdit, Espresso, etc.) and it seems that InDesign will not import any files if Unicode is specified as the encoding, no matter which other options are specified.

Can anyone else confirm this?

Importing as ACSII works, but obviously does not display my content correctly.

Views

43.1K

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 24, 2013 Mar 24, 2013

Copy link to clipboard

Copied

pattern86 wrote:

Importing as ACSII works, but obviously does not display my content correctly.

Obviously? What's obvious? What isn't being displayed correctly? I'm not sure that pastebin supports non-ascii characters, but I don't see anything in the paste in a quick look that should be a problem if the file is saved in ascii format.

If you want someone to test the file to see the results we'll need a link to a sample file.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Mar 24, 2013 Mar 24, 2013

Copy link to clipboard

Copied

It can be any valid unicode CSV file. One column or seven columns, one row or fifty rows, just alphabetic letters or numbers or unicode-only characters... it doesn't matter. None of them will import if Unicode is selected in the Data Source Import Options dialog.

Use what's in the pastebin link or copy and paste this into a new text file and save it as a utf-8 CSV and test:

Variable 1,Variable 2,Variable 3

John,Kevin,yes

Marsha,Jan,no

Roger,Bridget,no

Stephanie,Larz,yes

(And pastebin does support unicode characters—there are several left- and right-double-quotes in my example.)

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 24, 2013 Mar 24, 2013

Copy link to clipboard

Copied

I'm not going to claim to be an expert on file encodings, but if I copy your text and save in Notepad (I'm a windows User) as UTF-8 I am presented with the following error:

Bad CSV UTF-8.png

If I susbsquently open in Excel and remove the added characters before the text in the first cell (Which I presume are the UTF-8 binary tag) I am able to load the file as ASCII and do the merge.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 24, 2013 Mar 24, 2013

Copy link to clipboard

Copied

I also tried saving as Unicode text instead of UTF-8 in Notepad, and that works fine. As I say, I'm not expert in file encoding, but I think you may be under a misaprehension that UTF is Unicode encoding, and it simply is not a format that ID will read for Data Merge.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Mar 24, 2013 Mar 24, 2013

Copy link to clipboard

Copied

Perhaps it is a problem with the Macintosh version.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 25, 2013 Mar 25, 2013

Copy link to clipboard

Copied

I don't think so. I think it's a problem with UTF -8 encoding. Can you save as a different flavor of unicode from Google?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 25, 2013 Mar 25, 2013

Copy link to clipboard

Copied

I've had a similar problem with text that was generated on Filemaker Pro for Mac. Drove me INSANE. Was with a former employer and has not been an issue since, but I had a somewhat clumsy workaround:

  1. Make a new InDesign document and place the csv text into indesign as a textframe.
  2. While in the textframe with the text tool, select file/export and export as text.
  3. Go back to the data merge file and from the data merge panel, choose select data source. The data should now import... but some of the characters will go haywire.

I haven't really investigated what was going on as the job was always red-hot urgent, and I haven't been able to replicate the fault as I don't have Filemaker Pro. If anyone reading would like to try this,  have at it!

If the answer wasn't in my post, perhaps it might be on my blog at colecandoo!

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Mar 25, 2013 Mar 25, 2013

Copy link to clipboard

Copied

Google Drive only supports one format for exporting to CSV. I've tried converting the exported file to utf-8, utf-16—no dice. The best I can come up with is to import a unicode file as ASCII and then Find & Replace all out of range characters that were modified (sometimes lots of different characters, depending on the content).

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 25, 2013 Mar 25, 2013

Copy link to clipboard

Copied

Waht about opeing the UTF-8 file in Text Edit or Text wrangler and resaving from there as Unicode?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Advisor ,
Mar 25, 2013 Mar 25, 2013

Copy link to clipboard

Copied

I had no problem importing your data as either .txt or .csv. Granted I run CS5.5 but am also on a Mac and it went smoothly. I basically copied your data to TextEdit, saved and then imported through Data Merge. To get the CSV I opened that text in Excel and output as comma de-limited CSV.

It looks like your data (in PasteBin) has 2 different encodings, or at least has been created by 2 different software programs or maybe computers. You can see this in the text:

Screen Shot 2013-03-25 at 5.48.21 PM.png

As you can see, there are two different quotation marks going on - the first looks like it's been copied straight from Word and the second looks to have been generated by a text or HTML program. This is also a common problem with websites when people supply text in Word/Excel and copy straight into HTML - those characters get messed up.

Run a Find and Replace on the first set of quotes in that text and the file should load with no problem.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Mar 29, 2013 Mar 29, 2013

Copy link to clipboard

Copied

I think this is the problem, that my content requires all of these characters. UTF-8 supports it, but InDesign won't seem to read UTF-8.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Valorous Hero ,
Mar 30, 2013 Mar 30, 2013

Copy link to clipboard

Copied

I downloaded your sample. As per my editor of choice, it shows it is UTF-8 format:

capture-000179.png

After setting up ID to merge it:

capture-000178.png

Seems to work just fine. Unless I am missing something.

Take care, Mike

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 30, 2013 Mar 30, 2013

Copy link to clipboard

Copied

What editor are you using?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Valorous Hero ,
Mar 30, 2013 Mar 30, 2013

Copy link to clipboard

Copied

Hi Peter, UltraEdit. Been using it since the mid 1990s for programming and then later, for HTML, PHP and what not.

As regards this issue, UE will preserve the BOM with UTF files if it is set to do so (and not convert them to the OS'  format). Which is how I have it set. If one sees the odd characters at the beginning of a file, it is due to conversion to the OS format one is using. Otherwise the BOM is hidden from view.

Down towards the bottom of their page linked below explains the BOM if anyone cares. For UTF files it is necessary that these be preserved.

http://www.ultraedit.com/support/tutorials_power_tips/ultraedit/unicode.html

Take care, Mike

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Mar 30, 2013 Mar 30, 2013

Copy link to clipboard

Copied

Thanks.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Mar 30, 2013 Mar 30, 2013

Copy link to clipboard

Copied

Did you set it up the exact way I described? What is your InDesign version number and on which platform?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Valorous Hero ,
Mar 31, 2013 Mar 31, 2013

Copy link to clipboard

Copied

Sorry, been away at a weekend golf tournament. Tried to reply with my phone but up in the mountains I waas cut off on both reply attampts.

I am using CS6.

Your file is a UTF-8 ASCII file. If you want Unicode, it will be a UTF-16 file and as long as you are using Western languages, you really should be using the UTF-8 ASCII.

If you create a UTF-16 format CSV file, you will see that ID automatically determines it is the Unicode import type. Now, they are both Unicode (UTF-8 and UTF-16 that is). But to ID, I think the UTF-16 format(s) are used specifically for non-Western languages, or certainly a mix of Western and non-Western languages.

At least that's what I find.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Apr 01, 2013 Apr 01, 2013

Copy link to clipboard

Copied

When I import that utf-8 file as Unicode, nothing imports. When I import it as ASCII, I get various character conversions, such as

“ becomes ‚Äú

” becomes ‚Äù

’ becomes ‚Äô

… becomes ‚Ķ

Рbecomes –

and so on...

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Valorous Hero ,
Apr 01, 2013 Apr 01, 2013

Copy link to clipboard

Copied

When I import that utf-8 file

Is this the file you uploaded to pastebin? Frankly I do not see those characters listed in post 18 in that file. If the missing characters you list in post 18 are suppose to be in the text, try creating a ZIP file of the text (both the .txt and the source Google Doc spreadsheet, maybe saved as an Excel or other spreadsheet type) and posting it somewhere like dropbox. Then provide a link.

The reason it will not import as Unicode, I believe, is that it is a UTF-8 file and is considered as ASCII for import.

Mike

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer ,
Apr 01, 2013 Apr 01, 2013

Copy link to clipboard

Copied

The source data never existed as a .txt file. It was typed directly into a Google Spreadsheet. Here is a link to a zip containing the originally-downloaded .csv from Google Docs and then a utf-16 version that I converted using TextEdit on OS X.

http://cl.ly/3O2E0O0Y0g2U

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Valorous Hero ,
Apr 01, 2013 Apr 01, 2013

Copy link to clipboard

Copied

Ok. I think that we aren't getting anywhere quick at my end. I still do not see any upper-range characters in your file(s).

I went to a web site and copied a bunch of characters and made a table. I added dollar signs where their characters were missing. But I think it illustrates that if the file is set correctly, ID automatically picks the proper format for import.

capture-000181.png

And the result when merged into a new file:

capture-000180.png

Converting the source records as an ANSI text file, and importing it, ID chooses:

capture-000182.png

And the resultant merge equals:

capture-000183.png

Mike

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Apr 01, 2013 Apr 01, 2013

Copy link to clipboard

Copied

I downloaded the sample files, and I get the same character mapping issues. I wonder if this might be font-related. Is the font in Google spreadsheet really unincode compliant, or is it some older font that has Windows -Mac character mapping differences?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Valorous Hero ,
Apr 01, 2013 Apr 01, 2013

Copy link to clipboard

Copied

Google Doc export of the above here:

http://www.wenzloffandsons.com/temp/google_doc_export.csv

It is a UTF-8 ASCII.

Which means if you import it allowing ID to set the import type (Ascii) then it works fine.

Want me to ZIP up a Unicode (UTF-16) version? ID will then automagically pick Unicode for the type and the imports will be identical.

Mike

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Apr 02, 2013 Apr 02, 2013

Copy link to clipboard

Copied

Mike is having some trouble posting in this thread (and I am too), but he sent me a PM with what he wanted to say:

OK. I think I might have a positive answer for you.

I was getting lost in the upper ASCII characters you showed. In your test file I never could see any--a case of not seeing the trees for the forest.

Your quote marks are getting dropped in your test file. Now, this may or may not affect other factors but it does in some further testing. I believe ID has an issue with dropping quote marks even in a plain ASCII file if the marks are at the beginning of a sentence and the file is tab delimited. Call it a bug.

Because of all the commas and quote marks in your simple file, I think you should be exporting from Google Docs' spreadsheet as a tab-delimited file. This exported file has to be opened in a text editor capable of saving it out as a UTF-16 BE (Big Endian) type of file.

Also, I think you are going to have to use proper quote marks throughout, or change them in the exported tab-delimited file. Best to have a correct source, though.

Here is your sample ZIPped up. I think it works properly. But then again, I think I might be bleary-eyed by now.

http://www.wenzloffandsons.com/temp/merge_psalms_utf-16.zip

Take care, Mike

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines