9 Replies Latest reply on Feb 25, 2011 8:42 PM by DFBurns

    catalog:batchGetRawMetadata ()

    johnrellis Most Valuable Participant

      Last July, several of us observed that in LR 3.0 on Windows, catalog:batchGetRawMetadata () was very slow, processing roughly 700 photos/second:

       

      http://forums.adobe.com/message/2958375#2958375

       

      I just redid some timings with LR 3.3, and it is much, much faster now, comparable to what was reported for 2.7.  Here are the rates for various batch sizes (2x2 2.13 GHz i7 processor, 7200 RPM disk), retrieving the field "keywords":

       

      Batch size    Photos/second
      1             100
      4             349
      16            1144
      64            3300
      256           5368
      1024          5927
      4096          6480
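      For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing harness in plain Lua. It is not the code used for the numbers in this thread. The names measureRate and fetchBatch are hypothetical; fetchBatch stands in for whatever actually fetches metadata for one batch (inside a plug-in, presumably a call to catalog:batchGetRawMetadata). Note that os.clock reports CPU time rather than wall-clock time, so a wall-clock source should be passed as the last argument when timing real catalog access.

```lua
-- Times how long it takes to push `photos` through `fetchBatch` in chunks of
-- `batchSize`, and returns the rate in photos per second.
-- `clock` is optional and defaults to os.clock (CPU time, an approximation).
local function measureRate(fetchBatch, photos, batchSize, clock)
  clock = clock or os.clock
  local start = clock()
  for first = 1, #photos, batchSize do
    -- Build one batch of at most `batchSize` photos.
    local batch = {}
    for i = first, math.min(first + batchSize - 1, #photos) do
      batch[#batch + 1] = photos[i]
    end
    fetchBatch(batch)
  end
  local elapsed = clock() - start
  if elapsed <= 0 then
    return nil -- too fast for the clock's resolution
  end
  return #photos / elapsed
end

-- Usage with a stand-in fetch function that does nothing.
local dummyPhotos = {}
for i = 1, 10000 do dummyPhotos[i] = i end
local rate = measureRate(function(batch) end, dummyPhotos, 256)
print(rate and string.format("%.0f photos/second", rate) or "too fast to measure")
```

      Passing the clock in as a parameter keeps the harness independent of any one time source, which also makes it easy to check with a fake clock.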

        • 1. Re: catalog:batchGetRawMetadata ()
          areohbee Level 5

          I wonder if Adobe:

           

          - fixed this and didn't realize it

          - realized it but didn't say anything

          - said something but I/we missed it.

           

          In any case, it definitely seems worth mentioning - thanks John,

          R

          • 2. Re: catalog:batchGetRawMetadata ()
            DawMatt Level 3

            Thanks for the update John.  I know Lightroom 3.3 seems faster than previous dot releases but it is nice to have something quantitative to back it up.

             

            Rob, I think it was option 3 but required some reading between the lines on our part.  I'm sure they are aware of the speed increase here and it was tied to some of the other speed enhancements for the product (e.g. adjustment brushes or catalog responsiveness in general).  They have mentioned the general speed improvements in the release notes but haven't called out every single positive impact this has had across the product. 

             

            Matt

            • 3. Re: catalog:batchGetRawMetadata ()
              JW Stephenson Level 4

              I have recently moved to using the batch commands after I realized how much slower photo:getRawMetadata() and photo:getFormattedMetadata() became when moving from version 2.5 to 3.0.  They used to be instantaneous inside a loop; now they cost several seconds per 1000 images.  I measured them at approximately 20-30 times slower.  As an example (without having to have two versions loaded to test), retrieving the soon-to-be-deprecated photo.path is 30 times faster than calling photo:getFormattedMetadata("path").  If I had to guess, there is some database security happening here that is more secure in the newer version.  Fortunately, the batch commands are quite fast now.

               

              As a side note, why can't the Formatted or Raw version of the batch command (which ties to the non-batch commands) include all possible metadata, so I don't have to call both to get the lasteditdate and scene, for example?  Why are some fields overlapping and others not?  It seems to me that the Raw command should be all-inclusive and the Formatted one should cover only those fields that people commonly need and want to have pre-formatted.   Hmmm ... maybe there is a more fundamental reason why they are separate with many overlapping fields.  Just curious.

               

              Jeff

              • 4. Re: catalog:batchGetRawMetadata ()
                johnrellis Most Valuable Participant
                Why are some fields overlapping and others not?  It seems to me that the Raw command should be all-inclusive and the Formatted one should cover only those fields that people commonly need and want to have pre-formatted.

                I've wondered the same thing -- just seems like suboptimal design.

                 

                Also, note that passing "nil" does not retrieve all the fields, contrary to the documentation. You need to pass an explicit list to be sure to get the fields you want.
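                A minimal sketch of the explicit-list approach, written so it runs in plain Lua outside Lightroom. The fakeCatalog table is a stand-in I made up for illustration; it mimics the return shape I understand the SDK to use (a table keyed by photo, each value a table keyed by metadata key). Inside a plug-in you would use the real catalog object instead. The helper name fetchRaw and the sample values are hypothetical; the key names are ones mentioned in this thread.

```lua
-- Stand-in for the Lightroom catalog (assumption: the real call returns
-- a table keyed by photo whose values are tables keyed by metadata key).
local fakeCatalog = {}

function fakeCatalog:batchGetRawMetadata(photos, keys)
  local result = {}
  for _, photo in ipairs(photos) do
    local fields = {}
    for _, key in ipairs(keys) do
      fields[key] = photo.raw[key]
    end
    result[photo] = fields
  end
  return result
end

-- Always name the fields you need instead of passing nil.
local WANTED_KEYS = { "path", "lastEditTime", "uuid" }

-- Hypothetical helper: fetch only the explicitly listed keys.
local function fetchRaw(catalog, photos)
  return catalog:batchGetRawMetadata(photos, WANTED_KEYS)
end

-- Made-up sample photos for the stand-in catalog.
local photoA = { raw = { path = "/photos/a.dng", lastEditTime = 10, uuid = "A-0001", editCount = 3 } }
local photoB = { raw = { path = "/photos/b.dng", lastEditTime = 20, uuid = "B-0002", editCount = 7 } }

local meta = fetchRaw(fakeCatalog, { photoA, photoB })
for _, photo in ipairs({ photoA, photoB }) do
  print(meta[photo].path, meta[photo].lastEditTime, meta[photo].uuid)
end
```

                Keeping the key list in one place also makes it obvious which fields the plug-in depends on if the behavior of nil changes in a later release.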

                • 5. Re: catalog:batchGetRawMetadata ()
                  DawMatt Level 3

                  johnrellis wrote:

                   

                  Why are some fields overlapping and others not?  It seems to me that the Raw command should be all-inclusive and the Formatted one should cover only those fields that people commonly need and want to have pre-formatted.

                  I've wondered the same thing -- just seems like suboptimal design.

                   

                   

                  I'm working on a plugin at the moment where these issues have impacted me.  I've raised the lack of overlapping fields in the past, in particular because they strongly discourage interpreting the formatted fields but at the same time do not provide raw versions of some of the fields we might be interested in.  For example, I can only get SubjectDistance as a formatted field, but FocalLength can be provided as a raw field.

                   

                  Raising these as conceptual issues/bugs means more thinking is required before the problem can be addressed.  I'd rather give a more prescriptive feature request that makes it clear which fields we want to see added to the ...RawMetadata functions.  Means it should be easier/quicker for the team to update the SDK and provide the features we want.  If you can mention any of the specific fields you want to see added to the ...RawMetadata functions, I'll add them to my feature request.

                   

                  I suspect this issue is more due to organic growth of the SDK than a specific design principle.  If we point out where this has gone astray it will likely get fixed, just a matter of when.

                   

                   

                  johnrellis wrote:

                   

                  Also, note that passing "nil" does not retrieve all the fields, contrary to the documentation. You need to pass an explicit list to be sure to get the fields you want.

                   

                   

                  I think you have already mentioned the fields that fall afoul of this particular issue.  A bug request seems warranted for this.  I'll add one as well.

                   

                  Matt

                  • 6. Re: catalog:batchGetRawMetadata ()
                    JW Stephenson Level 4

                    Hi Matt,

                     

                    I think the solution is to make one of the calls all inclusive (getFormattedMetadata is the closest) and then have the other two match them field for field (the other two being getRawMetadata and setRawMetadata).  Some exceptions of course for items like Keywords and Plug-in data for writing.

                     

                    Given that is maybe too large in scope, I would settle for a few fields I get from the Raw call that should already be in the Formatted call.  Namely, the ones I keep acquiring are:

                     

                    lastEditTime: (number) The date and time of the last edit to this photo (seconds since midnight GMT January 1, 2001).

                    editCount: (number) Counter for edits on this photo. (Warning: This is not an absolute counter)

                    path: (string) The current path to the photo file if available; otherwise, the last known path to the file.
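                    Since lastEditTime counts seconds from midnight GMT on January 1, 2001 rather than from the Unix epoch, here is a small sketch of converting it to a readable UTC timestamp in plain Lua. It assumes the host's os.date uses the Unix epoch (true on the usual desktop platforms); the function name is hypothetical. Inside a plug-in, the SDK's LrDate namespace is probably the more natural tool.

```lua
-- Seconds between 1970-01-01 00:00:00 UTC and 2001-01-01 00:00:00 UTC:
-- 11323 days (31 years including 8 leap days) * 86400 seconds.
local SECONDS_1970_TO_2001 = 978307200

-- Converts a 2001-based timestamp to a "YYYY-MM-DD HH:MM:SS" string in UTC.
-- Fractional seconds are dropped. Assumes os.date counts from the Unix epoch.
local function lastEditTimeToUtcString(lastEditTime)
  local unixSeconds = math.floor(lastEditTime) + SECONDS_1970_TO_2001
  return os.date("!%Y-%m-%d %H:%M:%S", unixSeconds)
end

print(lastEditTimeToUtcString(0)) -- prints 2001-01-01 00:00:00
```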

                     

                    The second thing we need is exposure of a couple more fields, namely those relating to metadata update status.  The only current way to check involves a SQL command outside of LR.

                     

                    Finally, for completeness, the other fields (not as critical for me, but they may be for others) that need to be added to make getFormattedMetadata a complete set of the currently exposed fields (so at least one call could be made for acquiring all data) would be:

                     

                    countVirtualCopies: (number) The number of virtual copies of this photo. Zero if this photo is itself a virtual copy.

                    virtualCopies: (array of LrPhoto) All virtual copies of this photo.

                    masterPhoto: (LrPhoto) The master photo from which this virtual copy is derived.

                    isVirtualCopy: (Boolean) True if this photo is a virtual copy of another photo.

                    countStackInFolderMembers: (number) The number of the members of the stack that this photo is in.

                    stackInFolderMembers: (array of LrPhoto) All members of the stack that this photo is in.

                    isInStackInFolder: (Boolean) True if the photo is in a stack.

                    stackInFolderIsCollapsed: (Boolean) True if the stack containing this photo is collapsed.

                    stackPositionInFolder: (string) The position of this photo in the stack. The top of the stack is at position 1

                    topOfStackInFolderContainingPhoto: (LrPhoto) The parent photo of the stack containing this photo

                    uuid: (string) Persistent ID for this photo

                     

                    There are other fields in getRawMetadata that are not fully represented in getFormattedMetadata (like width and length) but they can generally be derived from existing fields and thus not "necessary" in my opinion.

                     

                    Given your intent to file or edit a feature request on this matter, I hope this provides some insight.  As you suggested, adding a few fields to getFormattedMetadata is probably the quickest fix.  If you would like me to file the report I would be happy to, but given your status it might get more attention coming from you.

                     

                    Jeff

                    • 7. Re: catalog:batchGetRawMetadata ()
                      DawMatt Level 3

                      Hi Jeff,

                       

                      Thanks for taking the time to write that up!  Very detailed, and it will help flesh out the feature request I was going to raise anyway.  I'll definitely be lodging a feature request covering this myself, but feel free to raise one as well if you want.  It's a bit unclear whether this type of "voting" improves the chances of a feature being delivered.

                       

                      Matt

                      • 8. Re: catalog:batchGetRawMetadata ()
                        areohbee Level 5

                        DawMatt wrote:

                         

                        It's a bit unclear whether this type of "voting" improves the chances of a feature being delivered.

                         

                         

                        Sounds like a good question to ask Adobe directly, now that you're on "the inside" ;-}

                         

                        Then, if it's not violating your NDA, share what they say with the rest of us. I'd sure like to stop repeating things if it's doing more harm than good, or just isn't helping...

                         

                        Rob

                        • 9. Re: catalog:batchGetRawMetadata ()
                          DFBurns Level 1

                          Sounds like the ideal case might be one function named getMetadata but have it take a boolean parameter for whether the returned metadata should be formatted nicely or not (if that notion is meaningful for a given piece of metadata). The implication would then be that the set of metatags that was available is the entire set whether or not "nice" formatting was possible.
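                          A rough sketch of what such a wrapper could look like today, built on the existing per-photo getRawMetadata and getFormattedMetadata calls. This is only one reading of the idea, not an SDK function: getMetadata, tryGet, and makeFakePhoto are hypothetical names, the fake photo exists only so the sketch runs in plain Lua, and the key names and values are illustrative. Because I am not sure how the real calls behave for a key they do not support, each call is wrapped in pcall and a failure is treated as "no value".

```lua
-- Calls photo[method](photo, key), treating an error as "no value".
local function tryGet(photo, method, key)
  local ok, value = pcall(photo[method], photo, key)
  if ok then
    return value
  end
  return nil
end

-- One entry point: prefer the formatted or the raw value depending on
-- `formatted`, and fall back to the other form when the preferred one
-- is not available for that key.
local function getMetadata(photo, key, formatted)
  local preferred = formatted and "getFormattedMetadata" or "getRawMetadata"
  local fallback = formatted and "getRawMetadata" or "getFormattedMetadata"
  local value = tryGet(photo, preferred, key)
  if value == nil then
    value = tryGet(photo, fallback, key)
  end
  return value
end

-- Stand-in photo object for running the sketch outside Lightroom.
local function makeFakePhoto(raw, formatted)
  local photo = {}
  function photo:getRawMetadata(key) return raw[key] end
  function photo:getFormattedMetadata(key) return formatted[key] end
  return photo
end

local photo = makeFakePhoto(
  { focalLength = 50 },
  { focalLength = "50 mm", subjectDistance = "2.0 m" }
)

print(getMetadata(photo, "focalLength", false))     -- raw value
print(getMetadata(photo, "focalLength", true))      -- formatted value
print(getMetadata(photo, "subjectDistance", false)) -- falls back to formatted
```

                          The fallback is what gives callers the "entire set" of fields from a single function, at the cost of not knowing which form came back unless the wrapper also reports it.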