Can I read the content of a PDF file into a ColdFusion variable?

Explorer, Jul 10, 2008

Hi,
I have a question about evaluating PDF files.

Is it possible to read a PDF file into a ColdFusion variable so that I can evaluate its content, e.g. with a regex, and extract some information as a string to store in a database?

I have to evaluate lots of PDF files that contain specific strings.
I have to find these strings and check whether they are already present in the database.

I have seen that I can read the properties of a PDF file with <cfpdf action="read">, but I do not understand how to get the body of the PDF file into a variable.

Any hint is highly appreciated!
Malvina.

Correct answer: Community Expert, Jul 11, 2008 (the full reply appears further down the thread)

Community Expert, Jul 10, 2008

... how to get the body of the pdf-file into a variable
<cfpdf action = "read" source = "mydoc.pdf" name = "mypdf">


Explorer, Jul 10, 2008

Thank you for your immediate reply, but
sorry, this does not work.

With this code:

<cfpdf action = "read" source = "dok_1.pdf" name = "mypdf">
<cfdump var="#mypdf#"/>

I get this result: everything but the text of the document.

PDFDocument
  Application:          name of application
  Author:               bimbam Verlag GmbH
  CenterWindowOnScreen: [empty string]
  ChangingDocument:     Allowed
  Commenting:           Allowed
  ContentExtraction:    Allowed
  CopyContent:          Allowed
  Created:              D:20080710
  DocumentAssembly:     Allowed
  Encryption:           No Security
  FilePath:             [empty string]
  FillingForm:          Allowed
  FitToWindow:          [empty string]
  HideMenubar:          [empty string]
  HideToolbar:          [empty string]
  HideWindowUI:         [empty string]
  Keywords:             [empty string]
  Language:             [empty string]
  Modified:             [empty string]
  PageLayout:           SinglePage
  Printing:             Allowed
  Producer:             [empty string]
  Properties:           [empty string]
  Secure:               Allowed
  ShowDocumentsOption:  [empty string]
  ShowWindowsOption:    [empty string]
  Signing:              Allowed
  Subject:              [empty string]
  Title:                Rheinische Angler-Zeitschrift
  TotalPages:           1
  Trapped:              [empty string]
  Version:              1.3

Maybe I do not understand the cfpdf tag correctly.
What I want is a kind of PDF-to-text conversion.
Do I have to use the processddx action? I do not think so. But there is a property DocumentText ...?

Advocate, Jul 10, 2008

Hi,

Try Ray's "PDFUtils" utility.

A nice little blog on this can be found here

HTH

Explorer, Jul 12, 2008

Thank you for your instant reply.
The blog helps me to understand the DDX-beast.
Now the DDX code is no longer confusing.
Problem solved!

thanks again
Malvina

Community Expert, Jul 11, 2008

sorry, this does not work.

With this code:

<cfpdf action = "read" source = "dok_1.pdf" name = "mypdf">
<cfdump var="#mypdf#"/>


You shouldn't expect it to work. You can't dump a binary like that.

That was just a part answer to a part question. To peek into the text of a PDF, you may use the utility Daverms recommends.

If you choose to do it yourself, use ColdFusion 8's DDX functionality. Here is an example to illustrate. The folder ddxTest contains the files myDDX.ddx, textFromPDF.cfm and myDoc.pdf, an arbitrary PDF that contains the text.
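
The code that accompanied this reply is not preserved in this copy of the thread. What follows is only a rough sketch of the approach described, not the original poster's files. It assumes the three file names mentioned above; the labels Doc1 and Out1, the output file myDocText.xml, the variable names and the regex pattern are all made up for illustration. myDDX.ddx might ask for the document text like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- myDDX.ddx (sketch): request the text of one input PDF as an XML result -->
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <DocumentText result="Out1">
    <PDF source="Doc1"/>
  </DocumentText>
</DDX>
```

textFromPDF.cfm could then run that DDX with cfpdf action="processddx" and read the resulting file into a variable:

```cfml
<!--- textFromPDF.cfm (sketch): run the DDX against myDoc.pdf and load the result --->
<cfset ddxPath = ExpandPath("myDDX.ddx")>

<cfif IsDDX(ddxPath)>
    <!--- The struct keys must match the source/result names used in the DDX file --->
    <cfset inputStruct = StructNew()>
    <cfset inputStruct.Doc1 = ExpandPath("myDoc.pdf")>
    <cfset outputStruct = StructNew()>
    <cfset outputStruct.Out1 = ExpandPath("myDocText.xml")>

    <cfpdf action="processddx"
           ddxfile="#ddxPath#"
           inputfiles="#inputStruct#"
           outputfiles="#outputStruct#"
           name="ddxResult">

    <!--- ddxResult.Out1 reports whether that output was produced --->
    <cfoutput>#ddxResult.Out1#</cfoutput>

    <!--- The extracted text is written as XML; read it into a variable --->
    <cffile action="read" file="#outputStruct.Out1#" variable="pdfTextXml" charset="utf-8">

    <!--- Crude tag stripping; XmlParse(pdfTextXml) is the alternative if page structure matters --->
    <cfset pdfText = REReplace(pdfTextXml, "<[^>]+>", " ", "all")>

    <!--- Hypothetical pattern: replace it with whatever identifies the strings you need --->
    <cfset hits = REMatch("[A-Z]{2}-[0-9]{4}", pdfText)>
    <cfdump var="#hits#">
</cfif>
```

The array returned by REMatch can then be looped over and each entry checked against the database, ideally with cfqueryparam in the query.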

Explorer, Jul 12, 2008

Great!
Thank you. I did it as you said. Success.
DDX is not as complicated as it seems at first look.
Nice weekend.

Malvina

Community Expert, Jul 13, 2008

I'm glad the problem is solved. A nice weekend to you, too.
