How do I create my own eBook?

New Here ,
Nov 25, 2009

I have many documents from school that I'd like to scan, convert to PDFs and then compile together into one PDF file. The problem is that whenever I scan a document, I can't compress it down enough without sacrificing quality, so I can't have a small PDF file. I have a few eBooks, and, looking at their properties, somehow the creators of those eBooks were able to compress their PDFs down enough without sacrificing any quality to the documents whatsoever. Here's an example:

One eBook that I have is 696 pages long, with a file size of 4,159,863 bytes, and it looks like it was simply scanned and put together, because the quality of the eBook matches the book that it came from (which I happen to own). True, the images in the eBook are black and white (some of the images in the book are in colour), but there is no noticeable degradation in their quality. According to its properties, it was created with the application PScript5.dll version 5.2 and the PDF producer Acrobat Distiller 6.0.1 (Windows), the PDF version is 1.5 (Acrobat 6.x), and it was NOT optimized for fast web view. How did the author get the per-page size so small (kb-wise) yet retain quality? If one were to do the math: 4,159,863 bytes divided by 696 pages equals 5,976.81 bytes per page, which divided by 1,000 equals 5.97681 kb. Almost 6 kb per page? How is that possible?
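The per-page arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `bytes_per_page` is made up for illustration, and the figures are the ones quoted in the post. It also shows that the result depends slightly on whether a kilobyte is taken as 1,000 or 1,024 bytes.

```python
def bytes_per_page(file_size_bytes: int, pages: int) -> float:
    """Average number of bytes each page contributes to the file."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return file_size_bytes / pages


# Figures quoted in the post: a 696-page eBook of 4,159,863 bytes.
ebook = bytes_per_page(4_159_863, 696)
print(f"{ebook:.2f} bytes per page")
print(f"{ebook / 1000:.2f} kB (decimal) or {ebook / 1024:.2f} KiB (binary)")
```

Either way the average comes out just under 6 kb per page, as stated in the post.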

The smallest I could get a scanned page was about 20 kb, and the quality was terrible. In fact, the only way I could shrink pages to that size was to scan them in black and white (1 bit) and then distill them as much as I dared. The vast majority of documents that I'm scanning are black and white text with very little in the way of graphics.

Here are the programs that I have at my disposal:

Jasc Paint Shop Pro 8 and HP Director (for scanning with my HP PSC 1210 all-in-one printer, which has OCR technology)

Microsoft Word 2000 (9.0.2720)
Adobe Acrobat 5.0.5 (10/26/01) (yes, it has paper capture)

Acrobat Distiller 5.0.5.2001101100

It is driving me absolutely crazy that I cannot figure this out. Even creating my own job option in Distiller and adjusting the settings didn't help.

It should be possible, with the programs that I have, to get the results I want. For those of you who believe I should upgrade to something beyond Adobe Acrobat 5, here's another example for you:

There used to be a link on www.iartonline.ca (which no longer exists) to a PDF that is 13 pages long, created with the application Microsoft Word 8.0. The PDF producer is Acrobat Distiller 4.0 for Windows, the PDF version is 1.2 (Acrobat 3.x), and the file size is 79,719 bytes. In contrast, I scanned two school documents (using OCR) into Microsoft Word, corrected any errors, and converted the result to an Adobe PDF from there. The resulting size (for TWO PAGES) was 25 kb.

So, in conclusion, it HAS to be possible for me to scan my documents, retain their quality while compressing their size, and compile them into single-file PDFs.

I apologize for the lengthy explanation (this is why I couldn't ask the question on WikiAnswers), but hopefully this will save others interested in answering from asking irrelevant (to me) questions.

Thank you for your time.

Views

9.2K

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Nov 28, 2009

Most books aren't scans.


New Here ,
Nov 29, 2009

@Bernd:

Yes, I know that most books aren't scans. What I'm asking is: how do I create books that ARE scanned (because I've seen many that are)?

LEGEND ,
Nov 28, 2009

Hi,


You reference a PDF that is the output of MS Word, not a PDF that has scanned images brought in.
Oil & water - not miscible.
A PDF coming from an authoring application will have a small file size.
In most cases, any images brought into the authoring file will have been optimized for effective resolution and usability by the targeted end-users.


You are scanning. Scanned images in a PDF = large file footprint.
As you noted, too aggressive a compression choice makes the image of text or graphic content unusable.
Aggressive compression reduces file size by destructively discarding image data.


For your situation, you want an effective resolution of 300 ppi minimum.
Keep in mind, B/W scan has smallest footprint, Grayscale or Color goes larger.
For anything usable, you will have to accept a larger file size (it is inherent to using scanned images).
Think raster vs vector...
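To put rough numbers on the B/W versus grayscale versus color point, here is a minimal Python sketch of the uncompressed raster size of one scanned page. The US Letter page size (8.5 x 11 inches) is an assumption, not something stated in the thread, and `raw_scan_bytes` is an illustrative name. These are the sizes before any compression is applied.

```python
import math


def raw_scan_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Uncompressed size of a scanned page, with each pixel row padded to a whole byte."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


# Assumed page: US Letter (8.5 x 11 in) scanned at 300 ppi.
for label, bits in (("black and white", 1), ("grayscale", 8), ("color", 24)):
    size = raw_scan_bytes(8.5, 11, 300, bits)
    print(f"{label:>15}: {size:,} bytes uncompressed")
```

Under these assumptions a 1-bit page starts at roughly 1 MB of raw data, while grayscale and color start 8 and 24 times larger, which is why the scan mode matters so much for the final file size.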


At 300 ppi, you can build a fairly good cataloged index of the OCR'd text content.


Note that "eBook" connotes EPUB, MOBI, etc. - formats containing content sourced from an authoring application, not from scanned images.


Be well...

New Here ,
Nov 29, 2009

@CtDave:

Alright, instead of using "eBook", I shall use "multi-page PDF file", as in "How do I create my own multi-page PDF file that is small in size yet retains the quality of a scanned image and is fully text-editable?"

Keep in mind that small, multi-page PDFs (that are fully text-editable and retain the quality of a scanned image) DO exist.

Yes, I know that using 300 ppi minimum is an effective resolution.

Yes, I know that a black-and-white scan has the smallest footprint, while grayscale or color is significantly larger.

So, like I said, what can I do with the programs that I have? What additional programs might I need?

LEGEND ,
Nov 29, 2009

You have hardware and software adequate to the task of developing scanned images that will end up in PDF and post processing the PDF(s) into functionally interactive documents / document collections.
Acrobat's OCR can provide searchable text (presuming the hardcopy text was from a word processor/text editor - usable OCR of handwritten text is possible with dedicated software ($), but not Acrobat OCR).
If the OCR option for Formatted Text & Graphics or ClearScan (AA9) is used, the user can perform edits to suspect OCR output.
Having used Acrobat 5.x (full) for similar activities in the past I know it is capable of supporting the creation of interactive PDF collections containing scanned content.


However, "fully editable" is a no show in PDF as PDF is a page description language and not a word processor/text editor format.
Particularly so when dealing with OCR output in a PDF.
Even Section 508 accessible, tagged output PDF produced from Adobe applications or MS Office applications is not "fully editable".


~ some observations ~
If you've not already done so, play with Acrobat 5's Optimization Options for Acrobat Scan.
Effectively, these are the same from AA4.x through AA9pExtd.
Avoid the default "Automatic" selection.
Avoid "Automatic" & "Aggressive"
Avoid "Adaptive" for Color/Grayscale.
Play with the Filtering settings to see what meets your needs.


Be well...

New Here ,
Dec 08, 2009

@CtDave:

Thanks for replying. I know that once I scan a paper document and convert it to a PDF, I can make it fully text-editable using the "Paper capture" option in Adobe Acrobat 5 (depending on what resolution was used for scanning the document, as the Paper capture option can become fussy about that sort of thing). The problem is this:

the higher the resolution I use for scanning, the bigger the file. In the end, I want very small PDFs that retain the quality of the scanned image so that I can compile many of them together into one PDF file that is no more than, say, a few hundred kilobytes. As an example, I have a PDF that is 23 pages long, contains 3 pictures, and is 197 kb. That document can be found here: http://www.arthurjonesexercise.com/Unpublished/Colorado.pdf

Also, my copy of Adobe Acrobat 5.0.5 does not have Optimization options for Acrobat Scan.

New Here ,
Dec 18, 2009

You have to use a newer release of Adobe Acrobat. Thanks.

New Here ,
Dec 19, 2009

@tiensindo-com:

You're sure there's no way to get what I want with the version that I have? Absolutely sure?

Also, is there a specific version of Acrobat I should be looking for, or will any newer version do?

New Here ,
Dec 20, 2009

Thanks for this, I will give it a go.

New Here ,
Jan 25, 2010

I don't understand why this question hasn't been answered yet. Am I not explaining it in clear enough terms or in plain enough English?

Since it seems to be such a big question to answer, I'll break it down:

Once I have scanned an image, how do I convert that image into a small PDF?

An example of what I mean: if I scan one sheet of paper...let's say it's a sheet of band music, meaning some text & mostly graphics in black and white...what settings would I use to convert that scanned image (that 'scanned image' could be in bitmap, jpg or just about any other kind of image file format) into a small PDF file? By 'small', I mean 20 kb or less (1 sheet of black and white text & graphics is DEFINITELY capable of being compressed down to that size with little to no loss of visual quality). I've messed around with various settings in Acrobat Distiller 5 AND 6, and the resulting PDFs have been WAY too big (100, 200, some even 400 kb). I've even done a scan of a sheet of band music, saved it as a jpg (82 kb), tried to open that jpg file in Acrobat 5, and received the message, "insufficient data for an image". Absolutely impossible, but it happened anyway. I've been able to open jpg files in Acrobat that were HALF that size and been able to save them as PDFs (of course, the quality was almost completely gone by that point).
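As a rough illustration of why a 1-bit page can end up small inside a PDF, here is a minimal, standard-library-only Python sketch. It builds a made-up black-and-white page, compresses the packed pixels with zlib, and wraps them in a bare single-page PDF as a FlateDecode image. Everything here is an assumption for illustration: the function names are hypothetical, the page is synthetic rather than a real scan (real scans contain noise and compress less well), and Flate is used only because zlib ships with Python. Bitonal scans are more commonly stored with the CCITT Group 4 or JBIG2 filters, which PDF also supports.

```python
import zlib


def synthetic_page(width_px=2550, height_px=3300):
    """Made-up 1-bit page: white background with bands of dashed black 'text' rows.

    Returns packed rows, 8 pixels per byte (1 = white, 0 = black).
    """
    row_bytes = (width_px + 7) // 8
    white = b"\xff" * row_bytes
    # Alternating short black and white runs stand in for a line of type.
    inked = (b"\x00\x0f\xff\xf0" * row_bytes)[:row_bytes]
    # 20 inked rows out of every 60, loosely like text lines with spacing.
    return b"".join(inked if y % 60 < 20 else white for y in range(height_px))


def one_page_pdf(packed, width_px, height_px, ppi=300):
    """Wrap packed 1-bit image data in a minimal single-page PDF (FlateDecode)."""
    w_pt = width_px * 72 / ppi  # PDF units are points, 72 per inch
    h_pt = height_px * 72 / ppi
    image = zlib.compress(packed, 9)
    # Page content: scale the unit square to the page size and paint the image.
    content = f"q {w_pt:.2f} 0 0 {h_pt:.2f} 0 0 cm /Im0 Do Q".encode("ascii")

    def stream(dict_entries, data):
        head = f"<< {dict_entries} /Length {len(data)} >>\nstream\n".encode("ascii")
        return head + data + b"\nendstream"

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {w_pt:.2f} {h_pt:.2f}] "
         f"/Resources << /XObject << /Im0 4 0 R >> >> "
         f"/Contents 5 0 R >>").encode("ascii"),
        stream(f"/Type /XObject /Subtype /Image /Width {width_px} "
               f"/Height {height_px} /ColorSpace /DeviceGray "
               f"/BitsPerComponent 1 /Filter /FlateDecode", image),
        stream("", content),
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += f"{number} 0 obj\n".encode("ascii") + body + b"\nendobj\n"
    xref_at = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode("ascii")
    out += (f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
            f"startxref\n{xref_at}\n%%EOF\n").encode("ascii")
    return bytes(out)


width_px, height_px = 2550, 3300  # assumed US Letter at 300 ppi
packed = synthetic_page(width_px, height_px)
pdf = one_page_pdf(packed, width_px, height_px)
print(f"raw 1-bit data: {len(packed):,} bytes, PDF: {len(pdf):,} bytes")
```

Writing `pdf` to disk with `open("page.pdf", "wb")` should give a file that opens in a PDF viewer. The sizes printed for this synthetic page say nothing definite about a real sheet of music or text. The sketch only shows the mechanism: the page is stored as a 1-bit image with lossless compression rather than as a JPEG.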

It's 2010. There are people out there who have been able to scan & save images as small PDFs (under 100 kb per sheet while retaining the quality of the scanned image) since at LEAST 2001. I fully believe that it's possible with the programs I have to accomplish such a goal, but I've exhausted almost every possible combination I can think of when it comes to scanning and Distiller settings.

Once again, here's what I have at my disposal...

An HP PSC 1210 All-in-one printer (has OCR technology)

Adobe Acrobat 5.0.5

Distiller 5 and Distiller 6

HP Director (a program for scanning and setting scanning parameters)

Jasc Paint shop pro 8 (also for scanning)

There's definitely someone out there who knows how to do this. Mods?

New Here ,
Mar 24, 2010

Please ask the programmer to find a way out.


New Here ,
Jun 19, 2010

Now I have Irfanview. Would that be of any help to anyone in explaining to me how to scan pages, make them text-editable and compress them to a very small file size while retaining quality? I want to make this as simple as possible for anyone else out there who may be asking the same (or similar) question(s). So, if someone could reply to this and say (for example):

"Since you have Irfanview, follow the steps in Outline A (shown below) in order to scan a page of a book (for example) that is text-only, make it OCR-readable and retain the quality of a scanned image while compressing it to the smallest filesize possible.

Outline A.

If you want to scan a page that has both text and graphics, follow the steps in Outline B (shown below) to attain the same results.

Outline B."

I see no reason why this isn't possible with the programs I have (as listed a few posts above). Anyone have any ideas?

New Here ,
Jun 28, 2010

I preferred to scan the old book, paste it into Office, and then publish to PDF.

Or you can outsource it to somebody else.

Guest
Oct 24, 2010

Can Google spider my PDFs if I include them in my sitemap or link to them from my site? Will Google be able to find other links to my site via the PDF? I guess at the end of the day, I am wondering if Google will be able to find content relevant to a search that may be stored in my PDF files.

Lawrence Kobescak

Guest
Nov 11, 2011

If you uploaded your ebook or PDF file only in compressed/zipped form for download, then the Google spider is unable to read or index your content.

Or

If you uploaded your ebook or PDF file as a raw PDF file, then the Google spider can read and index your PDF content and return it for relevant search queries.

That's it.

Guest
Nov 11, 2011

Follow this interesting tutorial to see how easy it is to create your own ebook using Adobe Acrobat:

New Here ,
Nov 15, 2011

Thanks Lisa, but not quite what I was looking for. I know how to create PDFs from Microsoft Word, but your video was helpful when it came to things like bookmarks, etc.

As I said in my original post, I'm looking for a way to create small (file size-wise), OCR'd PDF files from scanned images. The kind of answer I'm looking for is something along these lines:

"Okay, since you're using HP Director, set the PPI to 300, scan in grayscale," etc. "When using Irfanview, set the parameters to this, this, and that, and then convert to PDF. Then use Adobe Acrobat 5 and use this setting, that setting, and that setting."

Something simple and straightforward, almost like a formula. With the programs I have, it should be no problem for someone who's done this a thousand times to give me a quick answer that's effective.
