Identity-H

Guest
Nov 29, 2010

Hi,

I am having an issue extracting text in Adobe Acrobat. Text in any font that has the encoding Identity-H cannot be extracted.

When I copy all the text in the PDF and paste it into Notepad, it shows up as "?" if the font's encoding is "Identity-H".

If the font encoding is "Custom", "Ansi" or "Type 1", I don't find any issues while extracting text.

Your help in this regard is greatly appreciated.

Thanks,

Arun Segar

Views

45.9K

Community Expert
Nov 29, 2010

It's impossible to say what causes the problem.

You could try to open the PDF in another app like Illustrator and save again.

I was able to solve a similar problem in the past with this method.

LEGEND
Nov 29, 2010

Generally, the font you are using is not on your system and Acrobat cannot find a compatible font to substitute in the conversion.

Guest
Nov 30, 2010

Hi guys,

If I copy and paste text from the PDF file, some of it comes out as "?" or box characters. On checking, I found that the font is embedded but its encoding is Identity-H.

That seems to be why the text does not come out correctly. Is there any way to extract text from a PDF when the font encoding is Identity-H?

Note: I tried extracting the text using InDesign and Illustrator, and the same thing happened: the text came out as "?". From InDesign I saved the PDF as .ps and then saved that as a PDF again, and the text still came out as "?".

If there is any other option, please tell me, and also tell me whether it is possible to solve this issue using the Adobe SDK.

I tried the steps described on the site below, but they were not useful:

http://www.yawah.com/en/support/knowledgebase/fsi-pages/kb-0307-identity-h-fonts

Please give us your suggestions.

Thanks,

Arun Segar

LEGEND
Nov 30, 2010

A lengthy workaround is to save the file as a JPEG (or TIFF) and then open the graphics file as a new PDF. Then run OCR to get a common font type, after which you should be able to copy and paste. Be aware that OCR tends to produce errors, depending on the quality of the graphic (you typically need at least 300 dpi).

New Here
Jun 11, 2011

This option looks like it is working, thanks a lot... At the beginning it looked as if it was impossible, but it actually worked! Many, many thanks!

Community Expert
Sep 15, 2019

Hello,

Community Expert
Sep 15, 2019

LEGEND
Nov 30, 2010

This is relatively common, and is caused when the application creating the PDF fails to correctly embed the Unicode lookup table for the font. Without that lookup table there is no relationship between the visible character on screen and the equivalent character code, so copying and pasting the text will lead to either a series of unknown markers, or a jumble of characters with a 1:1 relationship to the original text.
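
In PDF terms, that lookup table is the font's /ToUnicode entry. A rough way to spot a likely problem font is to look for font objects that declare /Encoding /Identity-H but carry no /ToUnicode. The Python sketch below is only an illustration of that check: the function name and the sample bytes are made up for this post, and it scans only uncompressed objects in the raw file, so fonts stored inside compressed object streams would not be seen. A hit is a candidate, not proof, since some viewers can still recover text for fonts that use a well-known character collection.

```python
import re

# Matches one indirect object: "<num> <gen> obj ... endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj(.*?)endobj", re.DOTALL)


def fonts_missing_tounicode(pdf_bytes: bytes) -> list[int]:
    """Return object numbers of Identity-H fonts that have no /ToUnicode entry.

    Simplification (assumption): only uncompressed objects in the raw bytes
    are inspected; objects inside object streams are not decoded.
    """
    missing = []
    for match in OBJ_RE.finditer(pdf_bytes):
        obj_num, body = int(match.group(1)), match.group(3)
        is_identity_h = re.search(rb"/Encoding\s*/Identity-H", body)
        if is_identity_h and b"/ToUnicode" not in body:
            missing.append(obj_num)
    return missing


# Hypothetical fragment of a PDF: object 1 lacks /ToUnicode, object 2 has it.
sample = (
    b"1 0 obj\n<< /Type /Font /Subtype /Type0 /BaseFont /AAAAAA+Demo "
    b"/Encoding /Identity-H /DescendantFonts [3 0 R] >>\nendobj\n"
    b"2 0 obj\n<< /Type /Font /Subtype /Type0 /BaseFont /BBBBBB+Demo "
    b"/Encoding /Identity-H /DescendantFonts [4 0 R] /ToUnicode 5 0 R >>\nendobj\n"
)
print(fonts_missing_tounicode(sample))  # [1]
```

On a real file you would pass the bytes read from disk, for example `open(path, "rb").read()`.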

As a PDF stores the character codes rather than the human-readable text, the fact you can see a letter "A" on the page doesn't mean Acrobat has any idea that it's an "A". The lookup tables make that connection, so if they're missing or corrupted there's no way to recreate the semantic connection unless you can re-fry the file with an original copy of the font.
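
To make the point about codes versus letters concrete, here is a small Python sketch of what the lookup does. It is a simplified illustration, not how Acrobat is implemented: it reads only the `bfchar` entries of a ToUnicode CMap (real CMaps can also use `bfrange`), and the CMap text, the codes and the function names are invented for the example. With Identity-H the page content holds 2-byte codes, which in a subset font are usually just glyph numbers with no inherent meaning.

```python
import re

# One "<code> <unicode>" pair as written inside a beginbfchar ... endbfchar section.
BFCHAR_PAIR = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def parse_bfchar(cmap_text: str) -> dict[int, str]:
    """Build a code -> text table from the bfchar sections of a ToUnicode CMap."""
    table = {}
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.DOTALL):
        for code_hex, uni_hex in BFCHAR_PAIR.findall(section):
            # The destination is UTF-16BE, so it may hold more than one character.
            table[int(code_hex, 16)] = bytes.fromhex(uni_hex).decode("utf-16-be")
    return table


def decode_identity_h(hex_string: str, table: dict[int, str]) -> str:
    """Decode a hex string of 2-byte codes; codes with no mapping become '?'."""
    raw = bytes.fromhex(hex_string)
    codes = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    return "".join(table.get(code, "?") for code in codes)


# Hypothetical ToUnicode CMap fragment mapping three glyph codes to A, B, C.
TOUNICODE = """
3 beginbfchar
<0024> <0041>
<0025> <0042>
<0026> <0043>
endbfchar
"""

shown_text = "002400250026"  # codes as they might appear in a page's content
print(decode_identity_h(shown_text, parse_bfchar(TOUNICODE)))  # ABC
print(decode_identity_h(shown_text, {}))                        # ???
```

The second call shows the situation in this thread: the same codes with no table give nothing but "?" when pasted.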
