Font encoding problems when opening v7 in v10

Report · Nov 16, 2012

Hi, I recently upgraded to v10 but still have quite a lot of old content originally created in v7. The content looks fine on screen when I open, for instance, a Russian file, but it looks like a total disaster when I create a MIF file from it. It seems that somewhere in the process the whole encoding goes bananas.

The fonts used originally were HelveticaCyr Upright and Times Ten Cyr Upright. When I install these on my machine the content looks fine on my screen, but I can't change the font to anything else without totally screwing up the layout. Changing it to any other font that supports Cyrillic (for instance plain old Arial) turns the content into gibberish again.

An example: take the following Russian word: Оглавление (it means "contents", for those interested...)

When I copy this word, freshly entered in a text editor, into FM10 using one of the Cyr Upright fonts, it's shown as ????????. Changing the font to, for instance, Arial makes it show up as expected.

When I select the original word in FM (the one that looks fine on screen) and change the font to Arial, it looks like Îãëàâëåíèå. The same thing happens if I copy the word directly from FM10 into any text editor. So, while everything looks fine on screen when I open the file, I can't change the font without destroying the document, and worst of all, when I export to MIF all my Russian is converted to unreadable strings looking something like Îãëàâëåíèå.

It's driving me nuts. Does anybody have a clue where my problem is located?

Thanks in advance !

Report · Nov 16, 2012

I don't have a clue as to where the problem is, but I've found after doing upgrades for years and years that I generally get better results saving the lower number file out as MIF and then opening it in the new version. Seems to have the end effect of cleaning up things better than either just opening the lower number .fm file or opening it in the new package and then saving out as MIF.

Art

Art Campbell


Report · Nov 16, 2012

The whole problem is that it saves the same garbage in the MIF file, so it's only readable inside FrameMaker when using the old fonts. I've tried to 're-code' the weird characters (the Îãëàâëåíèå stuff), but without any luck.

Report · Nov 16, 2012

> The whole problem is that it saves the same garbage in the MIF file, so it's only readable inside FrameMaker when using the old fonts.

Yep, and that's the case for all legacy codepage apps and their documents. This is a major part of why Unicode was invented.

> I've tried to 're-code' the weird characters (the Îãëàâëåíèå stuff), but without any luck.

Those aren't weird. They are merely the roman/western glyphs for the 8-bit code points of what are Russian characters in a legacy codepage Cyrillic font.
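To make that concrete, here is a minimal Python sketch. It assumes the legacy font follows the Windows-1251 (Cyrillic) layout and that the "western" view is Windows-1252; nobody in this thread has confirmed which code page the Cyr Upright fonts actually use, but the sample word from the original post is consistent with that pairing.

```python
# Assumption: the legacy Cyrillic font uses the Windows-1251 layout and the
# replacement font interprets the same bytes as Windows-1252.
word = "Оглавление"

# The 8-bit code points that a code page font would store for this word.
legacy_bytes = word.encode("cp1251")

# The same bytes shown through a western (roman) font.
western_view = legacy_bytes.decode("cp1252")

print(legacy_bytes.hex())  # one byte per letter
print(western_view)        # the "garbage" quoted in the original post
```

The bytes never change; only the glyph table used to draw them does.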

You need a code page to Unicode converter. There are commercial products for that. There may be free ones, or even web pages where you can paste legacy text, select the language, and copy UTF8 back out. Never having needed to do this, the foregoing is already more than I know.
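For what it's worth, the core of such a converter is small. The sketch below is only an illustration, not a tested migration tool: the function name `legacy_to_unicode` is made up here, and the default code pages (text shown as Windows-1252, meant as Windows-1251) are an assumption that happens to fit the sample word in this thread.

```python
def legacy_to_unicode(garbled: str,
                      shown_as: str = "cp1252",
                      meant_as: str = "cp1251") -> str:
    """Reinterpret text that was drawn with the wrong 8-bit code page.

    Hypothetical helper: 'shown_as' is the code page the garbled glyphs
    come from, 'meant_as' is the code page the author's font really used.
    """
    # Step 1: recover the original 8-bit values behind the western glyphs.
    raw = garbled.encode(shown_as)
    # Step 2: read those same bytes with the intended Cyrillic code page.
    return raw.decode(meant_as)


print(legacy_to_unicode("Îãëàâëåíèå"))
```

Plain ASCII passes through unchanged, so mixed Latin and Cyrillic runs survive. Characters that do not exist in the 'shown_as' code page raise an error rather than being silently guessed.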

Monitor this thread for a few days. Perhaps some old hands who went through all of this back at FM8 will recall the process.

Report · Nov 16, 2012

Your FM7 document is using legacy "overlay" aka "code page" cyrillic font technology. The Russian text was created by applying the Cyr font in Paragraph Format, Character Format, or just as a local override, then typing the 8-bit code points for the glyphs in that font (which map to various accented characters in a roman font, as you see).

What's encoded in the .fm file (and .mif) for that text is 8-bit characters which could be displayed as roman, symbol, dingbat, cyrillic, Thai, or other glyph families, depending on what Font Family is applied.

FM8 and later, including your FM10, support Unicode (UTF8, specifically), in which every character, from every script, planet-wide, has a unique code point, encoded in UTF8 as one to four bytes. No overlays. No confusion about whether any byte sequence is supposed to be a D or a gamma. The only issue is whether or not the font has that code point populated with outlines.
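As a side note on the encoding itself: UTF-8 is variable-length, so the number of bytes per character depends on the code point. A quick way to check, using a few sample characters chosen here purely for illustration:

```python
# Sample characters (illustrative picks): Latin D, Greek gamma,
# Cyrillic O, the euro sign, and a musical G clef outside the BMP.
samples = ["D", "γ", "О", "€", "𝄞"]

utf8_lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")
    utf8_lengths[ch] = len(encoded)
    print(f"U+{ord(ch):04X}  {len(encoded)} byte(s)  {encoded.hex()}")
```

Each character has exactly one code point regardless of font, which is the property the legacy overlay fonts lack.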

FM apparently continues to support legacy overlay fonts, as long as the font is loaded on the machine during edit (and this is as it has always been).

The real concern here is: for FM8 and later, was any legacy-to-Unicode conversion supposed to be happening when you open a legacy document in a Unicode FM?

I'm guessing that it doesn't happen, because there would have to be some dialog controlling how to map the legacy font to some arbitrary Unicode font. FM would have to know what the language (glyph map) was for the old font, and would have to verify that each code point exists in the new Unicode font - seems unlikely.

We haven't needed to do this, so I have no direct experience with how it is supposed to work (I'm only watching the issue so we handle our non-roman special character usage in a portable way). This Adobe FM forum doesn't seem to go back to the FM8 transition period, and thus hasn't much content on this issue. There may be some work-arounds. Possibly FM gets a little more magical if your PC is set to Russian when you open the old doc in FM10.

Report · Nov 16, 2012

No magic here, alas... I tried to make my machine believe it's Russian, but no change.

Report · Nov 16, 2012

You could try a utility like this: http://www.codeproject.com/Articles/18393/CodePage-File-Converter

However, it wants a text file, so you'll need to re-tag afterwards. The alternatives are to use brute-force find/change to replace the old FrameRoman Cyrillic code point values with Unicode ones, or to create a script (ExtendScript or FrameScript) that essentially does the same thing as the manual approach. A royal PITA....
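If you go the find/change route, the list of pairs can be generated rather than typed by hand. This is a rough sketch in Python (not ExtendScript or FrameScript), and it rests on an assumption: that the characters you see after switching to a western font correspond to Windows-1252, and that the old font follows Windows-1251. The values FrameMaker actually stores may differ, so check a few pairs against your own document first. The name `build_find_change_table` is made up for this example.

```python
def build_find_change_table(shown_as: str = "cp1252",
                            meant_as: str = "cp1251") -> dict:
    """Map each western glyph to the Cyrillic character sharing its byte."""
    table = {}
    for byte in range(0x80, 0x100):
        raw = bytes([byte])
        try:
            wrong = raw.decode(shown_as)
            right = raw.decode(meant_as)
        except UnicodeDecodeError:
            # This byte is unassigned in one of the two code pages.
            continue
        if wrong != right:
            table[wrong] = right
    return table


table = build_find_change_table()

# Print the pairs for a manual find/change pass.
for wrong, right in table.items():
    print(f"find {wrong!r} (U+{ord(wrong):04X}) -> change to {right!r}")

# Or apply them all at once to a string.
fixed = "Îãëàâëåíèå".translate(str.maketrans(table))
```

Applying the table only to text that carried the old Cyr font matters: genuinely western accented characters elsewhere in the document would otherwise be converted too.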

AFAIK, Adobe never published a techdoc or blog entry on how to handle legacy documents that use code pages other than 1252 when migrating to the Unicode versions starting with FM8.
