Get non-Unicode characters in a PDF

Guest
Dec 06, 2017

Hi Team,

Is it possible to get all the non-Unicode characters from the text of a PDF? If so, which method should I use?

Thanks,

Maruthu

TOPICS
Acrobat SDK and JavaScript

Views

919

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
LEGEND, Dec 07, 2017

Please define “non-Unicode character” precisely, as it is not a PDF concept.

Community Expert, Dec 07, 2017

Export the PDF as Plain Text. I believe this will convert all characters to UTF-8.

Detecting non-Unicode characters is a different matter. You can't do that with JavaScript because JavaScript converts everything into Unicode.
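
As a rough illustration of the plain-text route, here is a minimal Python sketch that scans exported text for characters that often indicate a glyph with no usable Unicode mapping. It rests on an assumption that is not confirmed anywhere in this thread: that such glyphs show up in the export as the U+FFFD replacement character or as Private Use Area code points. The names `find_suspect_chars` and `scan_file` are hypothetical, and whether a given export actually behaves this way should be checked against a known sample.

```python
import unicodedata

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, the Unicode replacement character


def find_suspect_chars(text):
    """Return (line_no, column, char, reason) for each suspicious character.

    Heuristic only (an assumption, not an Acrobat guarantee): flags U+FFFD,
    Private Use Area code points (category "Co"), and unassigned code
    points (category "Cn").
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == REPLACEMENT_CHAR:
                hits.append((line_no, col, ch, "replacement character"))
            else:
                category = unicodedata.category(ch)
                if category == "Co":
                    hits.append((line_no, col, ch, "private use"))
                elif category == "Cn":
                    hits.append((line_no, col, ch, "unassigned"))
    return hits


def scan_file(path):
    """Scan a text file exported from the PDF (hypothetical path).

    errors="replace" turns bytes that are not valid UTF-8 into U+FFFD,
    so those are reported as well.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspect_chars(handle.read())
```

Calling `scan_file("exported.txt")` (a placeholder file name) would return a list of positions to inspect by hand. This does not look at the PDF's fonts or encodings, so it cannot replace the font-level inspection described in this reply.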

Have you looked at the Preflight "Browse Internal Structure" tool? This shows you details of the fonts.

Here is a better tool that allows you to select text and then it shows the properties.

Windjack Solutions, Inc. - PDF CanOpener

You can do this programmatically with the C++ Plug-in SDK. Is this an option?

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
