-
1. Re: Spotlight Meta-Data Importer
marcopperman Oct 26, 2010 10:29 AM (in response to ascript_guy)
Why, after FIVE YEARS (the time since people started asking for one), is there no Spotlight plugin for InDesign files??
-
2. Re: Spotlight Meta-Data Importer
coreworks May 23, 2011 3:23 PM (in response to marcopperman)
Wholeheartedly agree. I've had a few people recommend something called PageZephyr -- but it's a hundred to several hundred dollars. I would really like to see Adobe get rolling and get this arguably simple thing added to InDesign; the ability to search for text within .indd files without *having* to keep PDFs of each one around would save me a lot of time (and more than a decent amount of drive space).
-
3. Re: Spotlight Meta-Data Importer
John Hawkinson May 23, 2011 8:11 PM (in response to coreworks)
InDesign files have XMP metadata already. Is this really not just a matter of configuration?
-
4. Re: Spotlight Meta-Data Importer
coreworks May 23, 2011 8:38 PM (in response to John Hawkinson)
If the metadata is there, how does one get Spotlight to search within .indd files? If I remember the discussions I've read on the matter over the past half decade, it's up to Adobe to write one, and they haven't as of this writing (insofar as I know, unless it's part of CS5 -- I've not upgraded to CS5 yet).
-
5. Re: Spotlight Meta-Data Importer
John Hawkinson May 24, 2011 7:49 PM (in response to coreworks)
OK, so, I looked into this in a bit more detail.
It's up to someone to write, but it need not be Adobe.
Spotlight ships with ~20 importers, each of which declares a set of Uniform Type Identifiers (UTIs) upon which it operates. Each importer plugin gets called by Spotlight when a new file of its UTI type appears, and the plugin exports the metadata to Spotlight's metadata database.
It would appear that none of the standard Spotlight plugins simply reads the provided file and looks for XMP data, though probably some of them use some dispatch mechanism and then do so (like Image).
There are examples where people solve this problem for movie files by editing the Info.plist associated with the QuickTime mdimporter to add more UTIs. This does not work to add the INDD UTI to Image.mdimporter or Quicktime.mdimporter, but perhaps it's a bit close. It ought not be difficult to merge the sample XMP reading library from the XMP toolkit with the sample Spotlight plugin to make this work. But it is a bit of developer effort.
Also, more annoyingly, InDesign does not define a standard UTI for INDD documents, so unlike a Photoshop file that is com.adobe.photoshop-image, an InDesign file is, on my system, dyn.ah62d4rv4ge80w5xequ. This is circumventable, but annoying.
Blah.
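For anyone who wants to see which UTIs a given importer claims, here is a rough Python sketch. It assumes the importer's Info.plist lists its types under CFBundleDocumentTypes entries whose CFBundleTypeRole is "MDImporter", with the UTIs in LSItemContentTypes. The sample plist contents and the bundle identifier below are made up for illustration and are not taken from any real importer.

```python
import plistlib


def importer_utis(info_plist_bytes):
    """Return the UTIs declared by an .mdimporter's Info.plist (given as bytes)."""
    info = plistlib.loads(info_plist_bytes)
    utis = []
    for doc_type in info.get("CFBundleDocumentTypes", []):
        # Only entries with the MDImporter role describe Spotlight importing.
        if doc_type.get("CFBundleTypeRole") == "MDImporter":
            utis.extend(doc_type.get("LSItemContentTypes", []))
    return utis


# Hypothetical Info.plist, built in memory so the sketch runs anywhere.
SAMPLE_INFO_PLIST = plistlib.dumps({
    "CFBundleIdentifier": "com.example.SampleImporter",  # made-up identifier
    "CFBundleDocumentTypes": [
        {
            "CFBundleTypeRole": "MDImporter",
            "LSItemContentTypes": ["com.adobe.photoshop-image", "public.jpeg"],
        },
    ],
})

if __name__ == "__main__":
    for uti in importer_utis(SAMPLE_INFO_PLIST):
        print(uti)
```

To point it at a real bundle, read the bytes of Contents/Info.plist inside the .mdimporter and pass them in. An INDD file that only has a dyn.* UTI will not match anything such a list declares, which is the annoyance described above.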
-
6. Re: Spotlight Meta-Data Importer
John Hawkinson May 25, 2011 4:58 AM (in response to John Hawkinson)
OK, I was wrong.
Or if not technically wrong, fairly misleading.
I spent a while mucking around and have most of a Spotlight importer that reads XMP metadata from arbitrary files using Adobe's XMP toolkit, and can write to the Spotlight database, and it gets invoked at the right times, etc., etc. [It doesn't actually do the writing, but it's not much effort to make it do so.]
This addresses ascript_guy's desire from 2009:
For the Macintosh platform, I'd like Adobe to supply or make available a Spotlight Meta-Data Importer that would expose the XMP Meta-Data to Spotlight searches.
Unfortunately, this is not very useful, because the XMP metadata isn't very useful. As coreworks points out, the real utility comes from being able to search the text content of InDesign documents (though I'm not really sure how Spotlight deals with book-length stuff...). But the XMP metadata doesn't have that. It doesn't even have a proxy or summary or short piece of that.
Here's a sampling of the metadata from Blue Square.indd, a sample InDesign layout that ships with the XMP SDK:
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.1.2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/" xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#">
      <xmpMM:InstanceID>xmp.iid:0E49B4AD072068118C1493EC1DFA0B63</xmpMM:InstanceID>
      <xmpMM:DocumentID>adobe:docid:indd:af62ddc6-213f-11da-ac18-fe020baf4f13</xmpMM:DocumentID>
      <xmpMM:OriginalDocumentID>adobe:docid:indd:af62ddc6-213f-11da-ac18-fe020baf4f13</xmpMM:OriginalDocumentID>
      <xmpMM:History>
        <rdf:Seq>
          <rdf:li rdf:parseType="Resource">
            <stEvt:action>saved</stEvt:action>
            <stEvt:instanceID>xmp.iid:0D49B4AD072068118C1493EC1DFA0B63</stEvt:instanceID>
            <stEvt:when>2011-05-25T06:51:10-04:00</stEvt:when>
            <stEvt:softwareAgent>Adobe InDesign 7.0</stEvt:softwareAgent>
            <stEvt:changed>/;/metadata</stEvt:changed>
          </rdf:li>
          <rdf:li rdf:parseType="Resource">
            <stEvt:action>saved</stEvt:action>
            <stEvt:instanceID>xmp.iid:0E49B4AD072068118C1493EC1DFA0B63</stEvt:instanceID>
            <stEvt:when>2011-05-25T06:51:10-04:00</stEvt:when>
            <stEvt:softwareAgent>Adobe InDesign 7.0</stEvt:softwareAgent>
            <stEvt:changed>/metadata</stEvt:changed>
          </rdf:li>
        </rdf:Seq>
      </xmpMM:History>
    </rdf:Description>
    <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/" xmlns:xmpTPg="http://ns.adobe.com/xap/1.0/t/pg/" xmlns:xmpGImg="http://ns.adobe.com/xap/1.0/g/img/">
      <xmp:CreateDate>2005-09-07T14:40:37Z</xmp:CreateDate>
      <xmp:ModifyDate>2011-05-25T06:51:10-04:00</xmp:ModifyDate>
      <xmp:MetadataDate>2011-05-25T06:51:10-04:00</xmp:MetadataDate>
      <xmp:CreatorTool>Adobe InDesign 7.0</xmp:CreatorTool>
      <xmp:PageInfo>
        <rdf:Seq>
          <rdf:li rdf:parseType="Resource">
            <xmpTPg:PageNumber>1</xmpTPg:PageNumber>
            <xmpGImg:format>JPEG</xmpGImg:format>
            <xmpGImg:width>256</xmpGImg:width>
            <xmpGImg:height>256</xmpGImg:height>
            <xmpGImg:image>/9j/4AAQSkZJRgABAgEASABIAAD/7QAsUGhvdG9zaG9wIDMuMAA4QklNA+0AAA ...
            </xmpGImg:image>
          </rdf:li>
        </rdf:Seq>
      </xmp:PageInfo>
    </rdf:Description>
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:format>application/x-indesign</dc:format>
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Blue Square Test File - .indd</rdf:li>
        </rdf:Alt>
      </dc:title>
      <dc:description>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">XMPFiles BlueSquare test file, created in InDesign CS2, saved as .indd and .pdf.</rdf:li>
        </rdf:Alt>
      </dc:description>
      <dc:subject>
        <rdf:Bag>
          <rdf:li>XMP</rdf:li>
          <rdf:li>Blue Square</rdf:li>
          <rdf:li>test file</rdf:li>
          <rdf:li>InDesign</rdf:li>
          <rdf:li>.indd</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
    <rdf:Description rdf:about="" xmlns:xmpTPg="http://ns.adobe.com/xap/1.0/t/pg/" xmlns:xmpG="http://ns.adobe.com/xap/1.0/g/" xmlns:stFnt="http://ns.adobe.com/xap/1.0/sType/Font#">
      <xmpTPg:Colorants>
        <rdf:Seq>
          <rdf:li rdf:parseType="Resource">
            <xmpG:swatchName>Black</xmpG:swatchName>
            <xmpG:mode>CMYK</xmpG:mode>
            <xmpG:type>Process</xmpG:type>
            <xmpG:cyan>0</xmpG:cyan>
            <xmpG:magenta>0</xmpG:magenta>
            <xmpG:yellow>0</xmpG:yellow>
            <xmpG:black>100</xmpG:black>
          </rdf:li>
        </rdf:Seq>
      </xmpTPg:Colorants>
      <xmpTPg:Fonts>
        <rdf:Bag>
          <rdf:li rdf:parseType="Resource">
            <stFnt:fontName>Times-Roman</stFnt:fontName>
            <stFnt:fontFamily>Times</stFnt:fontFamily>
            <stFnt:fontFace>Regular</stFnt:fontFace>
            <stFnt:fontType>TrueType</stFnt:fontType>
            <stFnt:versionString>Times-Roman6.0d6e5</stFnt:versionString>
            <stFnt:composite>false</stFnt:composite>
            <stFnt:fontFileName>Times.dfont</stFnt:fontFileName>
          </rdf:li>
        </rdf:Bag>
      </xmpTPg:Fonts>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
So, if you happen to fill in the Author/Title/Keyword metadata in ID, then maybe that is useful. Or if the thumbnail image of the file is useful.
Any of it could get stuffed into the spotlight data, but to what end?
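To make "stuffing it into the Spotlight data" a little more concrete, here is a minimal Python sketch that pulls the Dublin Core title, description and keywords out of an XMP packet like the one above. It is not the importer's code (that uses Adobe's XMP toolkit); it only parses the XML with the standard library. Pairing the fields with the Spotlight attributes kMDItemTitle, kMDItemDescription and kMDItemKeywords is my assumption about a sensible mapping, and the packet is a trimmed copy of the Blue Square sample.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the Blue Square packet, keeping only Dublin Core fields.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title><rdf:Alt>
    <rdf:li xml:lang="x-default">Blue Square Test File - .indd</rdf:li>
   </rdf:Alt></dc:title>
   <dc:subject><rdf:Bag>
    <rdf:li>XMP</rdf:li>
    <rdf:li>Blue Square</rdf:li>
   </rdf:Bag></dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def dc_values(root, tag):
    """Collect the text of every rdf:li found under a dc:<tag> element."""
    values = []
    for element in root.iter("{%s}%s" % (DC, tag)):
        for item in element.iter("{%s}li" % RDF):
            if item.text:
                values.append(item.text.strip())
    return values


def searchable_fields(xmp_text):
    """Map Dublin Core fields to (assumed) Spotlight attribute names."""
    root = ET.fromstring(xmp_text)
    return {
        "kMDItemTitle": dc_values(root, "title"),
        "kMDItemDescription": dc_values(root, "description"),
        "kMDItemKeywords": dc_values(root, "subject"),
    }


if __name__ == "__main__":
    for name, values in searchable_fields(SAMPLE_XMP).items():
        print(name, values)
```

Note that nothing here touches the document's body text, which is exactly the limitation described above.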
Doing any better requires being able to parse the INDD file format and getting text out to give to spotlight. That is not an easy task.
Markzware specializes in tools that understand the insides of InDesign (and Quark) files, so it's not surprising that they have a product for this (PageZephyr). And really, $100 doesn't seem too much to pay.
So, blah.
I suppose one could take the thumbnail and make a QuickLook generator out of it.
If someone wants to pull XMP metadata out of some other kind of file that's more useful and would like me to finish the spotlight plugin, you should let me know. Otherwise I don't see bothering.
Oh, I suppose another option is to use the SDK or Scripting API to have InDesign pull out the text when it is saving a document. This is counter to the Apple-mandated Spotlight philosophy:
A Spotlight importer must run entirely without interaction. You should not attempt to present any user interface or expect that the window server is running.
You should not expect your application to be running when your metadata importer is called. Importers can be called at any time to extract metadata from a file. Your metadata importer should be able to extract the information without any assistance from the application that created the file.
I suppose we could break the rules, but even then it's annoying to do and would probably have bad performance.
-
7. Re: Spotlight Meta-Data Importer
Pickory May 26, 2011 1:26 AM (in response to John Hawkinson)
Hello,
This is something I would like to look at.
How do you get your meta data into your documents?
Thanks.
-
8. Re: Spotlight Meta-Data Importer
Pickory May 26, 2011 1:30 AM (in response to Pickory)
Ooops, sorry John, I didn't see your last post.
I have been looking at the blue square stuff too.
-
9. Re: Spotlight Meta-Data Importer
John Hawkinson May 26, 2011 3:18 AM (in response to Pickory)
Pickory, I can't tell if you have a question, or what it might be.
Can you be a bit more clear?
-
10. Re: Spotlight Meta-Data Importer
Pickory May 26, 2011 3:28 AM (in response to John Hawkinson)
Hello John,
My question was how do you enter your metadata, apart from the very basic stuff.
You have already answered by pointing out the spotlight plugin would not be able to index the content of the document.
-
11. Re: Spotlight Meta-Data Importer
John Hawkinson May 26, 2011 3:44 AM (in response to Pickory)
I wonder if the terminology is confusing.
The content of the document is not properly "metadata."
In InDesign, you can edit and view most of the metadata with File > File Info.
You can view the XMP metadata (different from InDesign metadata) that Spotlight has with "mdls filename" in the Terminal. Or just type "mdls " (with a space at the end) and drag an icon into the Terminal from the Finder, and hit return.
-
12. Re: Spotlight Meta-Data Importer
coreworks May 26, 2011 4:00 AM (in response to John Hawkinson)
I may be misrepresenting my ultimate goal -- most simply put, what I want is for Spotlight (Mac OS) to be able to search text within an InDesign (.indd) file. Currently Spotlight can only find a file if the searched text is in its title; e.g., if I'm looking for a résumé for Rob Zombie, and search for "Robert", I will not get a Spotlight return unless the name "Robert" is in the title of the file. However, if I've made a PDF of that InDesign document, Spotlight will find the text within the .pdf file.
So what I am pining for, according to what I've learned about Spotlight and its capabilities, is for Adobe to write a plugin that lets Spotlight search, I guess, text strings within a .indd file. Currently, the only way I've heard of to search within InDesign files is a $100 to $200 program called "PageZephyr".
-
13. Re: Spotlight Meta-Data Importer
John Hawkinson May 26, 2011 4:21 AM (in response to coreworks)
coreworks, you were clear enough. It seemed like some others in this
thread might have use cases that didn't require the full text.
Oh, I suppose one option that might meet your needs but is really
terrible would be to look through the InDesign document for anything
that looks like text. This would give you a lot of false positives and
some really ugly stuff in your spotlight database.
For an example, go to Terminal.app and type in
strings -10 filename.indd
and be prepared for many, many screenfuls of output (e.g. a test I just
ran was 500 screenfuls). It certainly wouldn't be hard to tell
spotlight that was the full text of your document. It'd probably
enable searching for Rob Zombie, but might have other negative consequences.
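For the curious, this is roughly what "strings -10" does, as a small Python sketch: scan the raw bytes and keep every run of at least ten printable ASCII characters. It is an approximation for illustration, not the real strings implementation and not an InDesign parser. The sample bytes are invented (the font version string is borrowed from the XMP sample earlier in the thread) to show how a false positive slips through next to real text.

```python
import re


def ascii_strings(data, min_len=10):
    """Return runs of at least min_len printable ASCII characters found in data."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Invented bytes standing in for a slice of an .indd file.
SAMPLE_BYTES = (
    b"\x00\x01Robert Zombie resume\x00"
    b"ab\x02"
    b"Times-Roman6.0d6e5\xff"
)

if __name__ == "__main__":
    for found in ascii_strings(SAMPLE_BYTES):
        print(found)
```

The short run "ab" is dropped by the length cutoff, but the font version string is long enough to be kept, which is the kind of junk that would end up in the Spotlight database.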
-
14. Re: Spotlight Meta-Data Importer
John Hawkinson May 26, 2011 10:28 AM (in response to John Hawkinson)
coreworks, you may be in luck!
Doing any better requires being able to parse the INDD file format and getting text out to give to spotlight. That is not an easy task.
It turns out it's actually nowhere near as bad as I thought.
Maybe I'll have something to test tonight or tomorrow.
-
15. Re: Spotlight Meta-Data Importer
John Hawkinson May 28, 2011 12:46 AM (in response to John Hawkinson)
OK.
ALPHA-QUALITY SOFTWARE ALERT
Download http://web.mit.edu/jhawk/tmp/InDesignImporter-0.1alpha.dmg
and install InDesignImporter.mdimporter into ~/Library/Spotlight. (Or the /Library version if you want to live dangerously.)
I'm not sure if spotlight will automatically reindex old files.
This is alpha-quality software. It's prototyped as a slow and rather painful perl script that walks through the INDD file looking for things that look like strings (generally they start with @-signs), and then outputting them to Spotlight. It will produce some false positives, perhaps things like the names of styles and fonts that are encoded in your InDesign document. It will probably miss some strings. It might even crash the Spotlight importer process (mdimport).
You can force a file to be indexed by typing
mdimport /Users/myname/path/to/file.indd
and if you add -d 1
mdimport -d 1 /Users/myname/path/to/file.indd
it'll tell you which spotlight importer is being used, and -d 2 will show you the metadata it finds for the file.
I didn't actually bother pulling out the XMP metadata, though that's "easy." I spent most of the time bashing on the full text part.
Oh, it also screws up Unicode characters, in part because it excludes them as part of its heuristic for what is a string and what is not, but that's kind of messed up. Anyhow, it outputs them as a literal "U+2019" for a right apostrophe.
Anyhow, let me know how it works for you. Oh, yeah, it's slow. Because it wasn't really written efficiently...
Oh, and it only looks for CS5 [and CS5.5] files (type IDd7). That'd be easy to change, by editing the Info.plist file.
Let me know how it works. Not really sure if there's much point in making it better...easy to do though.
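As an illustration of how the literal "U+2019" output could be cleaned up after the fact, here is a hedged Python sketch that turns such tokens back into real characters. It is not part of the importer; the function name is made up, and it assumes the tokens are always "U+" followed by exactly four hex digits. Allowing more digits would be ambiguous whenever the next letter of the text happens to be a hex digit, as in "heU+2019d".

```python
import re

# Assumption: escapes are always "U+" plus exactly four hex digits.
_CODEPOINT = re.compile(r"U\+([0-9A-Fa-f]{4})")


def unescape_codepoints(text):
    """Replace literal U+XXXX tokens with the characters they name."""
    return _CODEPOINT.sub(lambda match: chr(int(match.group(1), 16)), text)


if __name__ == "__main__":
    print(unescape_codepoints("Rob ZombieU+2019s resume"))
```

A document that genuinely contains the text "U+2019" would be mangled by this, so it is a heuristic at best.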
-
16. Re: Spotlight Meta-Data Importer
John Hawkinson Jun 1, 2011 5:59 PM (in response to John Hawkinson)
Helooooo?
-
17. Re: Spotlight Meta-Data Importer
Harbs. Jun 2, 2011 3:58 AM (in response to John Hawkinson)
Hi!
Harbs
(Who's too swamped with work to look at this fascinating piece of software...)
-
18. Re: Spotlight Meta-Data Importer
coreworks Jun 2, 2011 6:37 AM (in response to John Hawkinson)
Ditto what Harbs said -- thanks for doing this, but I am in the middle of a massive workflow (six separate engineering clients; working on a handful of standard forms proposals for each) and can't use something experimental at the moment. Bookmarked, and I will jump on it as soon as I get over this hump.
-
19. Re: Spotlight Meta-Data Importer
RKSinNC2 Jun 22, 2011 10:17 AM (in response to John Hawkinson)
Tried out your Spotlight plugin.
Using: mdimport -d 1 /path/to/file.indd
I get: Segmentation fault
Using: mdimport -d 2 /path/to/file.indd
I get: (Info) Import: Import '/Volumes/path/to/file.indd' type 'edu.mit.jhawk.adobe.indesign-document' using '/Library/Spotlight/InDesignImporter.mdimporter'
Segmentation fault
File doesn't seem to get indexed, as Spotlight doesn't find text within the file.
Using Mac OS X 10.6.7 and InDesign CS5 (7.0.4)
Rodney
-
20. Re: Spotlight Meta-Data Importer
John Hawkinson Jun 22, 2011 2:10 PM (in response to RKSinNC2)
Oh, finally, a tester! There seems to be some screwup where I can't
seem to build a version that works on both 10.5 and 10.6, possibly
relating to 64-bit, but possibly not. Which is your system, I'll
build a version for it.
Alpha quality...
-
21. Re: Spotlight Meta-Data Importer
John Hawkinson Jun 22, 2011 8:05 PM (in response to John Hawkinson)
OK, I think it was just 10.6, not 64/32-bit. Try the version that's up now; Get Info should call it 0.1c.
Thanks for testing.
-
22. Re: Spotlight Meta-Data Importer
RKSinNC2 Jun 23, 2011 8:10 AM (in response to John Hawkinson)
I tried the new version and it seems to work. If an InDesign file has extended text (sentences, paragraphs, etc.), it is usually indexed. With some files that only have a word or two per paragraph (I lay out business forms, so I have a lot of these), it seems the only things indexed are font and color swatch names.
Rodney
-
23. Re: Spotlight Meta-Data Importer
John Hawkinson Jun 23, 2011 5:11 PM (in response to RKSinNC2)
OK...is that a big deal? I assume mdimport -d 2 shows that it is not finding the data you're referring to? If you want to send me the file I can look at fixing that...
I would certainly like to get rid of font and swatch names, but I don't know how to distinguish them from other strings in the file. Alas...
-
24. Re: Spotlight Meta-Data Importer
[Jongware] Jun 24, 2011 2:05 AM (in response to John Hawkinson)
John Hawkinson wrote:
[..]
I would certainly like to get rid of font and swatch names, but I don't know how to distinguish them from other strings in the file. Alas...
About the only way is to parse the entire ID file header & Internal Object list, and filter out the plain text objects (which are scattered all over the file). The way to distinguish "text objects", by the way, from other objects such as color names, font lists, and spell check exceptions, is by comparing their object IDs to the ones defined in the SDK. And -- just to add to the phun -- there may be different definitions for different versions of ID.
At that level, it's most certainly not a trivial thing to write, I can tell you that.
-
25. Re: Spotlight Meta-Data Importer
RKSinNC2 Jun 29, 2011 11:35 AM (in response to John Hawkinson)
FYI, when I use the "mdimport -d 2" command in Terminal, it may only show some of the text that is being indexed. I've discovered that words not reported by using that command are still being indexed.
So your mdimporter is doing a pretty good job.
Rodney
-
26. Re: Spotlight Meta-Data Importer
John Hawkinson Jun 29, 2011 12:27 PM (in response to RKSinNC2)
That is...not what I would expect. But my Spotlight importer can only
hand the data off to Spotlight. Whether Spotlight tells you all of
the data it uses is a different question that I don't have control
over, as far as I know.
I got some tips about parsing the file format so I may be able to improve
things. But not this week, I'm afraid.
But if you want to see what strings it is finding, you can run the
embedded perl script directly.
~/Library/Spotlight/InDesignImporter.mdimporter/Contents/Resources/idstrings.pl myfile.indd /dev/stdout
for instance.
-
27. Re: Spotlight Meta-Data Importer
darrenoia Jun 10, 2014 10:45 AM (in response to John Hawkinson)
John, a very belated response: apparently I installed this plugin a while ago and it worked so well I didn't even remember I had. Kudos for great work. Somehow it's stopped working for me but I'm trying to reinstall now.




