Can metadata be altered or supplemented?

Explorer, Apr 06, 2011

Many eBook files have incorrect metadata, such as "author".  Is there any way to overwrite or supplement the metadata info in a way that ADE can display it?  I hate the degradation of quality that is part of the crowd-sourcing / weak publisher responsibilities world of ebooks.

Can't this behave a little more intelligently, ideally more like the metadata handling in Lightroom?  Sidecar files if necessary (preferably not)?  The two formats I am most concerned about are PDF and EPUB.

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Explorer, Apr 06, 2011

For what it might be worth, if you read your books using the txtr app (www.txtr.com), you first upload or email them to the txtr site, and then pull them onto your device from within the app.

Once they are in your account on the txtr site, you can go to the books individually and edit the titles and authors as you choose, as well as add categories (labels) to the books. These edited data are then what shows up on your device.  It is a nice feature. I do not know how this might work with other devices.  I am not related to txtr in any way. I just found they have a useful app, especially for complex PDFs on iOS.

Explorer, Apr 07, 2011

Thanks, that's an interesting suggestion but seems to be iPhone-specific.  I'm just using a regular eReader (Sony, in this case) via ADE.  Does anyone know of a more general-purpose tool that can modify PDF or EPUB info?

For a PDF generally, you can modify much of the metadata (using Adobe Acrobat directly with Ctrl-D), though I can't identify which field maps to the ADE "Publisher" field.  (Is there a correspondence table anywhere?)  This does not work on documents that are password-protected against writing (e.g., downloads from PoemHunter).  And, of course, you need more than Acrobat Reader to do that.

For EPUB files, I'd rather do this from within ADE, but will certainly use external tools if available even though that introduces another step.

This post summarizes the problem nicely: http://ipadtest.wordpress.com/2010/03/31/the-epub-ebooks-metadata-mess/ , even though it's Apple-specific.  For bibliophiles, it's like for photographers -- we're talking about tens of thousands of books, ultimately, and we need reliable/modifiable metadata accordingly.  I understand that ADE is an early version, but I'd like to see it improve drastically in this area or someone will come along and do it better -- the metadata requirements are too important, and probably too easy to meet, to ignore.

One recommended tool I have located is Sigil.  Let me show you some of the problems...

A) Inadequate author field information and display

1) I downloaded "Some Experiences of an Irish R. M." from Project Gutenberg as an EPUB file (free).  The authors are Somerville & Ross or, more explicitly, Edith Somerville and Martin Ross.

2) When imported into ADE, the Author field displays: "Martin Ross".

3) When examined under Sigil, the Author field displays: "Ross, Martin; Somerville, E. Oe. (Edith Oenone)"

Apparently, ADE pulls only one item from the Author field, and the creator of the EPUB book filled it in alphabetical order by last name, rather than in the proper order, bibliographically speaking.  So there are multiple problems, not all of which are ADE's fault, but ADE should at least display (and make searchable/sortable) all Authors, and should provide a way to modify the fields it displays.  I have downloaded 3 books by Somerville & Ross (there are many more) and no two of them display the same authors, so they don't sort together.
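
To make the "only one author shows up" problem concrete, here is a minimal Python sketch that reads every `dc:creator` entry from an OPF package document instead of just the first. The OPF fragment is hypothetical (modeled on the Somerville & Ross example, not copied from the actual Project Gutenberg file), and `list_creators` is an illustrative helper name, not part of ADE or Sigil. The `opf:file-as` attribute is the OPF 2.0 way of carrying a "Last, First" sort form alongside the display name.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Hypothetical OPF fragment; a real file may store the authors differently.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Some Experiences of an Irish R. M.</dc:title>
    <dc:creator opf:file-as="Ross, Martin">Martin Ross</dc:creator>
    <dc:creator opf:file-as="Somerville, E. Oe. (Edith Oenone)">E. Oe. Somerville</dc:creator>
  </metadata>
</package>
"""


def list_creators(opf_xml):
    """Return (display_name, file_as) for every dc:creator, in document order."""
    root = ET.fromstring(opf_xml)
    creators = []
    for el in root.iter("{%s}creator" % DC):
        display = (el.text or "").strip()
        # Fall back to the display name when no sort form is supplied.
        file_as = el.get("{%s}file-as" % OPF, display)
        creators.append((display, file_as))
    return creators


# Show all authors rather than only one.
print("; ".join(display for display, _ in list_creators(SAMPLE_OPF)))
# prints: Martin Ross; E. Oe. Somerville
```

A reader application that kept the whole list could display and sort on every author; whether ADE reads the first element, the last, or something else is not established in this thread.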

B) Inadequate Author sorting

On a related front, I purchased a few EPUB books which were set up by commercial publishers at least (vs Google or crowd-sourced).  Their decisions vary between Author: last-name, first-name and Author: first-name, last-name.  Either ADE should properly interpret alternative sorts on Author by first vs last, or I need to change the metadata to be consistent.  For the example in (A), ADE replaced the provided "last,first" order of "Ross, Martin" with "Martin Ross"; why didn't it do that for the commercial books?
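
The mixed "last-name, first-name" versus "first-name last-name" problem can be illustrated with a small normalizing sort key. This is a naive sketch under stated assumptions: `author_sort_key` is a hypothetical helper, it treats the last whitespace-separated word as the surname when there is no comma, and it will get multi-word surnames, suffixes, and non-Western name orders wrong.

```python
def author_sort_key(name):
    """Return a lowercase 'last, first' key for either input order (heuristic)."""
    name = " ".join(name.split())  # collapse stray whitespace
    if "," in name:
        # Already "Last, First": just tidy the spacing.
        last, _, first = name.partition(",")
        return f"{last.strip()}, {first.strip()}".lower()
    parts = name.split(" ")
    if len(parts) == 1:
        return parts[0].lower()
    # Assume the final word is the surname.
    return f"{parts[-1]}, {' '.join(parts[:-1])}".lower()


authors = ["Martin Ross", "Somerville, Edith", "Robert Browning"]
print(sorted(authors, key=author_sort_key))
# prints: ['Robert Browning', 'Martin Ross', 'Somerville, Edith']
```

Because the key is computed at sort time, the stored metadata can stay exactly as the publisher supplied it.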

C) Inadequate Title sorting

These are BOOKS, not cans of soup.  We all understand that Titles that begin with "the", "a", "an" are more properly displayed with that article moved to the end ("Fine and Private Place, A").  That should be a sort option, at least.
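
The title rule described here is simple enough to sketch in a few lines of Python. `title_file_as` is a hypothetical helper name, and the article list covers only the three English articles named above; other languages would need their own lists.

```python
ARTICLES = ("the", "a", "an")


def title_file_as(title):
    """Move a leading English article to the end: 'A Fine Place' -> 'Fine Place, A'."""
    title = title.strip()
    first, sep, rest = title.partition(" ")
    if sep and first.lower() in ARTICLES and rest.strip():
        return f"{rest.strip()}, {first}"
    return title


titles = ["The Iliad", "A Fine and Private Place", "Animal Farm"]
print(sorted(titles, key=lambda t: title_file_as(t).lower()))
# prints: ['Animal Farm', 'A Fine and Private Place', 'The Iliad']
```

Used as a sort key, this leaves the displayed title alone; used as a display transform, it produces the catalogue-style form.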

D) User annotation

I would very much like three or more fields for user annotation.  These could be used for things like: volume/series (as in Volume 3 of 10), publication date, etc.  In this infant field, these are needed both for basic book information to supplement the inadequacies of the publishers and for personal reference data.

E) DRM issues re: metadata

Tools like Sigil won't even open DRM-protected files.  I can understand why external tools might not be proper for this purpose, but it underscores why ADE must supply some basic metadata editing/substitution tools.  Any serious book collector will want to "fix" the bibliographic metadata that is inadequately and inconsistently supplied for their collection.

I was frankly, if naively, shocked to discover that applications like ADE are nowhere near this very basic functionality yet.  ADE should be providing the equivalent of the consistent library card catalogue card entry, modifiable by the user.  There's no excuse for not doing that for a medium that is centuries old and whose requirements are so well understood.  In fact, the content/format of a card catalogue card, plus a few user-added fields, would be an excellent model to follow.

I chose ADE because I'm betting that Adobe will step up to the challenges of books the way they have for photos.  Please don't disappoint me.

Engaged, Apr 07, 2011

This is an interesting discussion, but....

First, why do you think you are qualified to do the modifications?

Next, why do you think you should modify the data? Wouldn't reporting the errors be enough?

Then, there's the legal side: check the Digital Millennium Copyright Act provisions for corrections and modifications to the data. This isn't Wikipedia....

Explorer, Apr 07, 2011

This is why I feel qualified, aside from an academic background: http://www.ljndawson.com/permalink/2010/01/21/Metadata_More_Important_Than_Ever.html  And, no, reporting the errors is not enough.

Basically, no one is doing it reliably or to standards at the fundamental data level, and entire industries exist for issuing corrections.  I am perfectly well-qualified (if it's anyone's business) to correct the data in my thousands of books.

I don't need Adobe to fix data content errors, but I do need them to recognize that they are prevalent in real world data and to provide me with the tools necessary to deal with them myself.  Somebody will, and I'd rather it were Adobe.

Don't you ever fix the metadata in an MP3 file?  Isn't it ever wrong (like, all the time)?  eBooks are as bad, or worse, but the music db handlers (like iTunes) are much further along than the eBook db handlers (like ADE) in allowing many more displayed fields, user modifications, and better (if inadequate and non-hierarchical) sorts.

Engaged, Apr 07, 2011

Sorry - I still don't buy direct action, because, once something is published, the author can complain, but should not be fixing stuff. It's out of your hands and in Adobe's. You must go through them - or you become a hacker under the law.

Explorer, Apr 07, 2011

This topic is about metadata, not DRM -- let's try to stay on topic.

Most of my ebooks are currently free and unprotected, and all my requests pertain to them equally well.  It has nothing to do with DRM or Adobe's rights.

I fail to see how DRM would come into it in any case.  If I buy a physical book and choose to write all over the title page, no harm is done.  If I buy an eBook and correct its inadequate title page information and add my own scribbled notes to my copy, no one's rights are violated.  DRM isn't a concept that makes corrections magically taboo, and nothing is being hacked. The fields are not set by the author, BTW, but by the publisher.

I refuse to tolerate having ebooks where the Title is "Poems of Robert Browning" and the Author is "Unknown".  I don't care if the original Author field is left intact and a "UserAuthor" field is added -- whatever works for the specifications that are evolving.  But low-quality metadata is not acceptable, not now and not in the future.  If I can't rely on publishers to supply sufficient information, I'll do it myself for my own ebooks.  I've done it for my thousands of CDs ripped into MP3s for my convenience.  The tools need to assist in this, not make it difficult, because if this one doesn't (ADE), others will -- there's no suppressing the requirement.  I personally will move from ADE to any application that does a good job of this, but I am hoping Adobe will step up.  I'm sure I'm not alone.

Enthusiast, Apr 08, 2011

Metadata for EPUB files is kept in the .opf file, usually at the root of the EPUB archive.  The .opf is not encrypted (or otherwise protected by DRM).

The spec for the opf file is here: http://old.idpf.org/2007/opf/OPF_2.0_final_spec.html

Unzipping that file, editing it in a text editor and rezipping it back into the epub is certainly possible. 

When I need to do it, I do it by hand, but I realize that is not even close to ideal for everyone.

However, I don't know of any tools that will just edit the metadata without also wanting to read the book (and so running afoul of the encryption).
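
The unzip / edit / rezip route described above can be sketched in Python with the standard library. This is an illustration under stated assumptions, not an Adobe tool: `rewrite_opf` and `find_opf_name` are hypothetical helper names, the edit itself is left to a caller-supplied function, and the sketch relies on the point made above that the .opf is not encrypted. Two details come from the EPUB container format rather than from this thread: the .opf location is declared in `META-INF/container.xml`, and the `mimetype` entry has to be the first entry in the archive and stored uncompressed.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def find_opf_name(zf):
    """Return the archive path of the .opf as declared in META-INF/container.xml."""
    root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find(".//{%s}rootfile" % CONTAINER_NS)
    return rootfile.get("full-path")


def rewrite_opf(src_path, dst_path, edit_opf):
    """Copy an EPUB to dst_path, passing the .opf bytes through edit_opf()."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        opf_name = find_opf_name(src)
        # 'mimetype' goes first and is stored without compression.
        dst.writestr("mimetype", src.read("mimetype"),
                     compress_type=zipfile.ZIP_STORED)
        for info in src.infolist():
            if info.filename == "mimetype":
                continue
            data = src.read(info.filename)
            if info.filename == opf_name:
                data = edit_opf(data)  # only the package document is changed
            dst.writestr(info.filename, data,
                         compress_type=zipfile.ZIP_DEFLATED)


# Hypothetical usage: replace a placeholder author in a copy of the book.
# rewrite_opf("poems.epub", "poems-fixed.epub",
#             lambda opf: opf.replace(b">Unknown<", b">Robert Browning<"))
```

Writing to a new file rather than editing in place keeps the original untouched if the edit goes wrong. All other entries are copied byte for byte.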

Explorer, Apr 08, 2011

Thanks, Jim.  I'm hoping ADE will come to include such tools.  I assume these forum requests are funneled to the development team in some fashion?

Engaged, Apr 08, 2011

Jim Lester is an Adobe employee, so his comments concerning the technical side should help.

As a legal consultant, I can see where you're coming from. If it's just your copy, no harm and no foul. Once it's turned over to some other organisation, however, they have the responsibility. DRM is a supplier - middleman if you prefer - between the custodian of the text (which is your publisher or Gutenberg, etc.) and the end user. DRM thus must transmit the rights from the custodian to the end user correctly. So, if you as the author have set privileges and Adobe/DRM does not transfer them correctly, then you start with DRM and see whether their 'pass-through' is accurate. If not, game over: they fix it. If so, then it's back to the publisher. A mess to be sure, and you have every right to be unhappy about it.

The music side is more mature: we've had those issues sorted now for almost 7 years. ASCAP and others have pretty clear rules and guidelines. As the ebook field matures, I think much of this nonsense will be sorted and procedures will be streamlined and synthesized so the process is clear to all. People will make mistakes, however, and those procedures need to have error correction built into them.
