
Generating .pdf files from .txt files for all code pages

Guest
Dec 28, 2017


Hello all!

My problem in one sentence: how can I generate .pdf files from a Windows batch script (.bat), or from any other programming language, with support for CP1250 (and other code pages)?

Some background:

Over at the dostips.com forum (dedicated to Windows batch scripts), I asked (https://www.dostips.com/forum/viewtopic.php?f=3&t=8289) whether anybody had already generated .pdf files from .txt files with a Windows batch script (.bat).

One expert found a very old program (with C source) written by Phil Smith in 1996: http://www.eprg.org/pdfcorner/text2pdf/. According to the author, it supports only 7-bit characters. It looks like it can support 8-bit characters if

/Encoding /WinAnsiEncoding

is added. According to the PDF reference, /WinAnsiEncoding corresponds to CP1252. This was tested with only one 8-bit character (the umlaut Ü), and it seems to work.
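To make the role of that line concrete, here is a minimal sketch in Python (not the text2pdf source and not Steffen's batch port) that writes a one-page PDF using the standard Courier font with /Encoding /WinAnsiEncoding. The function name `build_pdf`, the page size and the text position are my own assumptions for illustration.

```python
def build_pdf(text: str) -> bytes:
    """Return a one-page PDF showing `text` in Courier with WinAnsiEncoding."""
    raw = text.encode("cp1252")  # raises UnicodeEncodeError outside CP1252
    # Escape the characters that are special inside a PDF literal string.
    for special in (b"\\", b"(", b")"):
        raw = raw.replace(special, b"\\" + special)
    content = b"BT /F1 12 Tf 72 720 Td (" + raw + b") Tj ET"
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        # The font object: this is where /WinAnsiEncoding is declared.
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Courier "
        b"/Encoding /WinAnsiEncoding >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # each xref entry is exactly 20 bytes
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    with open("hello.pdf", "wb") as fh:
        fh.write(build_pdf("Ü works with WinAnsiEncoding"))
```

Because the text is encoded with Python's `cp1252` codec, anything outside CP1252 (such as č) fails at the encoding step instead of silently showing the wrong glyph.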

This dostips.com expert (Steffen) managed to port it to batch. It looks like it can only support CP1252. I would like support for CP1250 (and maybe other code pages, too). Only single-byte characters are used (0x00 - 0xff).

CP1252 lacks the 'ccaron' character (čČ), for example, while ŠŽ are supported.
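The difference between the two code pages can be checked with a short Python sketch that relies only on the standard `cp1250` and `cp1252` codecs. The helper names `encodable` and `coverage` are made up for this example.

```python
def encodable(ch: str, codepage: str) -> bool:
    """Return True if `ch` has a byte value in the given code page."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def coverage(chars: str) -> dict:
    """Map each character to (in CP1250, in CP1252)."""
    return {ch: (encodable(ch, "cp1250"), encodable(ch, "cp1252")) for ch in chars}


if __name__ == "__main__":
    for ch, (in_1250, in_1252) in coverage("čČšŠžŽÜ").items():
        print(f"{ch}: cp1250={in_1250} cp1252={in_1252}")
    # The same byte means different characters in the two code pages.
    byte = "Č".encode("cp1250")
    print(byte.hex(), "read as cp1252:", byte.decode("cp1252"))
```

This is also why a CP1250 text file shown through /WinAnsiEncoding gets the wrong glyphs for čČ: the byte is valid in both code pages, but it maps to a different character in each.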

We tried to embed the 'Lucida Console' font, without success.

I don't want to use print-to-PDF or anything like that. When I generate a 'report' with my .bat (the result is a plain .txt file), I want to be able to generate a .pdf file from it.

And here are my questions:

1. Why are there some 'binary' bytes after the '%PDF-version' line? They usually start with % (a comment) followed by some additional bytes (e.g. 0xe2e3cfd3). What is their purpose? (This has nothing to do with the C source or the .bat port, which don't emit them; I just noticed it in .pdf files generated with Word or other software.)

2. How can 'Lucida Console' (or Courier New, or any other monospaced font that is allowed to be embedded) be embedded in full? That would be the easiest way, since I wouldn't have to work out which characters a document uses. Which objects would a .pdf file need to support this? Of course, I prefer /ASCIIHexDecode (or maybe /ASCII85Decode) because those characters are easier to write from a .bat.
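As a rough sketch of the /ASCIIHexDecode part of question 2: as far as I understand the PDF reference, a fully embedded TrueType font is usually described by a font dictionary (/Subtype /TrueType with /FirstChar, /LastChar, /Widths, /Encoding and /FontDescriptor), a font descriptor (with /FontFile2 pointing at the font program), and the font stream itself. The Python below builds only that last stream object from raw font bytes. The names `hex_font_stream` and `decode_hex_payload` are hypothetical, and the test data is a placeholder rather than a real font.

```python
import binascii


def hex_font_stream(font_bytes: bytes, width: int = 64) -> bytes:
    """Wrap raw font bytes as the body of an /ASCIIHexDecode stream object."""
    hexdata = binascii.hexlify(font_bytes).upper()
    # Line breaks are whitespace, which the ASCIIHexDecode filter ignores.
    lines = [hexdata[i:i + width] for i in range(0, len(hexdata), width)]
    payload = b"\n".join(lines) + b">"  # '>' is the end-of-data marker
    head = b"<< /Filter /ASCIIHexDecode /Length %d /Length1 %d >>" % (
        len(payload),     # length of the encoded data in the file
        len(font_bytes),  # length of the decoded TrueType font program
    )
    return head + b"\nstream\n" + payload + b"\nendstream"


def decode_hex_payload(payload: bytes) -> bytes:
    """Reverse the encoding above (used here only to check the round trip)."""
    body = payload.split(b">", 1)[0]
    return binascii.unhexlify(b"".join(body.split()))


if __name__ == "__main__":
    placeholder = bytes(range(256))  # stands in for the bytes of a .ttf file
    print(hex_font_stream(placeholder).decode("ascii"))
```

The output is pure ASCII, which is what makes this filter attractive for a .bat writer, at the cost of roughly doubling the size of the embedded font.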

Thank you for any info.

Saso

no replies
