Hi,
I am considering a switch from Expression Media to Lightroom, and for that I wrote a Lightroom plugin that imports all my metadata from an EM XML file. Functionally, it works great.
However, I have a performance problem: if I run the plugin on my whole photo collection (more than 30,000 items), Lightroom goes crazy with CPU and memory once my code returns.
The processing time seems to grow exponentially. If I run the plugin on 1,000 items, it returns almost instantly. On 5,000 items, it takes several minutes. With 30,000 items, it still hadn't finished after 30 minutes, so I killed Lightroom.
My guess is that Lightroom does some kind of post-processing, or maybe that's because Lightroom prepares the "undo".
Is this a known problem?
Here is the skeleton of my code. Once doStuff2 returns, Lightroom does who knows what, and then doStuff returns too. This time seems exponential with the number of database changes my code causes.
local LrApplication = import 'LrApplication'

-- Runs inside the prolonged write access block opened by doStuff()
function doStuff2(context, progress)
    local catalog = LrApplication.activeCatalog()
    local photos = catalog:getTargetPhotos()
    parseXml()
    for completed, photo in ipairs( photos ) do
        if progress:isCanceled() then return end
        photo:addKeyword(...)
        setMetadata(...)
    end
end

function doStuff()
    local catalog = LrApplication.activeCatalog()
    catalog:withProlongedWriteAccessDo(
    {
        title="Import from Expression Media",
        func=doStuff2,
        caption="Initializing plugin",
        pluginName="Expression Media Import",
        optionalMessage = "The plugin will first read your Expression Media XML file and then import the data into Lightroom for "
            .. ( # catalog:getTargetPhotos() ) .. " photos"
    })
end

import 'LrTasks'.startAsyncTask( doStuff )
Hi,
As Rob said, the official way to choose the commit point is to close a "with-do" and then start a new one for the next batch of transactions. This technique is useful when batching large numbers of changes to the catalog (as you are) or when making changes that only become visible after a commit (e.g. creating collections or changing their contents).
Matt
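A minimal sketch of the batching described above, in plain Lua. The names processInChunks, runTransaction and perItem are hypothetical and do not come from the SDK or from anyone's plugin in this thread; runTransaction is assumed to run its argument inside one "with-do" block and to return only after that block has closed.

```lua
-- Split `items` into consecutive chunks and run each chunk inside its own
-- transaction. `runTransaction(fn)` is a hypothetical callback that must call
-- fn() inside a single write-access block and return once it has closed.
local function processInChunks(items, chunkSize, runTransaction, perItem)
  local transactions = 0
  for first = 1, #items, chunkSize do
    local last = math.min(first + chunkSize - 1, #items)
    runTransaction(function()
      for i = first, last do
        perItem(items[i])
      end
    end)
    transactions = transactions + 1
  end
  return transactions
end

-- Inside Lightroom (from within an async task), the callback could be, e.g.:
--   local function runTransaction(fn)
--     catalog:withWriteAccessDo("Import from Expression Media", fn)
--   end
```

Passing the transaction wrapper in as a parameter keeps the chunking logic independent of Lightroom, so it can be tried out with a stub before it is wired to the catalog.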
While every database technology has various resource issues with long-running, voluminous transactions, I wonder in this case whether the problem might not be with the relational database (SQLite) but rather LR's undo facility.
A quick experiment: Try catalog:withPrivateWriteAccessDo() instead. From the SDK:
Provides write access to custom fields defined by your plug-in. Use this instead of withWriteAccessDo() if you are only modifying metadata for your plug-in and do not want to add the operation to the undo stack.
The documentation is ambiguous, not differentiating between the intended use and actual semantics. But I'd guess that the only difference between withWriteAccessDo() and withPrivateWriteAccessDo() is that the latter doesn't add all the catalog changes to the LR application undo stack.
If that quick change doesn't work, then I agree with Rob that breaking up the changes into smaller batches will give much better overall performance.
ocroquette wrote:
I was almost done, but now I have to refactor my code to process batches of 1,000 files instead of processing the 30,000 files at once, and that's not a trivial task
It's a pisser alright, and although not trivial, it's not too difficult either.
I've been lobbying Adobe to remedy this. It's one thing for it to take a long time, but a worse problem is that it can over-consume RAM and hang the system.
I decided to build the incremental catalog updating into my core methods (not yet released):
http://forums.adobe.com/thread/1009149
Even when splitting the update into chunks, Lightroom gets slower and slower as it goes. For me, it usually makes it through if I let it run long enough, but it's a mystery why it gets sooooooo slow after a time...
Rob
I am done with my transfer now. I have posted my code here:
https://ocroquette.wordpress.com/2012/05/23/switching-from-expression-media-to-lightroom/
Thanks all for the great input.
About the performance: on my system, my plugin processes about 100 photos/second at the beginning, but by the end it is down to 40 photos/second, even when working with small chunks. I used chunks of 1,000 photos, but in the end I don't think chunking changes the total time much. It mainly makes it possible to show progress, which is a must, given that the whole process takes 10-15 minutes for my 33,000 pictures.
I also realized that fetching keywords from Lightroom is damn slow. I implemented a cache, which helped, but the cache has to be flushed after each transaction.
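A rough sketch of such a cache in plain Lua, not the code from the linked plugin. The names newKeywordCache and lookup are hypothetical; lookup stands for whatever slow SDK call resolves a keyword name to a keyword object.

```lua
-- Memoize keyword lookups by name. `lookup(name)` is a hypothetical function
-- wrapping the slow catalog access; it is called at most once per name until
-- the cache is flushed. A nil result is not cached and is looked up again.
local function newKeywordCache(lookup)
  local byName = {}
  local cache = {}

  function cache.get(name)
    local keyword = byName[name]
    if keyword == nil then
      keyword = lookup(name)
      byName[name] = keyword
    end
    return keyword
  end

  -- Call this after each transaction so stale entries are not reused.
  function cache.flush()
    byName = {}
  end

  return cache
end
```

Calling cache.flush() right after each write-access block closes matches the constraint mentioned above, at the cost of repeating the lookups in the next chunk.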
I think I also found a bug. The following code:
local keyword = catalog:createKeyword("Test" , { }, true, nil )
local children = keyword:getChildren()
Produces the following error:
An internal error has occured bad argument #2 to 'format' (number expected, got string)
I have a workaround in my code for this bug, so it's not blocking me, but I would like to report it to Adobe. Do you know the best way to do so? I am unsure.
And does it make sense to complain about the performance issues of big transactions? If yes, where?
Thanks again for your help. This thread has been very useful to me!
Olivier
The LrKeyword methods, along with most methods that access objects represented in the catalog, are very slow. But at least they got about 6 times faster in LR 4 compared to LR 3. LrPhoto has batch methods for getting metadata from many photos at once, but there's no equivalent for keywords. I think it's a big deficiency in the SDK.
Adobe wants all bugs and feature suggestions for Lightroom to be posted here:
http://feedback.photoshop.com/photoshop_family/products/photoshop_family_photoshop_lightroom
Those few of us developers active here generally start the subject lines with "SDK: ".
Hi Olivier,
ocroquette wrote:
I think I also found a bug. The following code:
local keyword = catalog:createKeyword("Test" , { }, true, nil )
local children = keyword:getChildren()
Produces the following error:
An internal error has occured bad argument #2 to 'format' (number expected, got string)
It is possible you will need to close a transaction (i.e. a with...Do call) between the first and second command, as the second one is accessing a change to the catalog that might not have been committed yet. This could be causing the bug you are observing. Does your workaround involve something like this? If so, it might not actually be a bug and instead be operating "as designed", but not documented clearly enough to be easily understood by those new to the SDK.
Matt
Very possibly.
I've been really stoked with my new multi-phase catalog accessor - strongly recommend writing something like it for yourselves (see source code in other thread, or contact me for fresh stuff).
Here's an example from my most recent plugin:
local s, m = cat:update( 10, "Assure Folder Collection Set", function( context, phase ) -- 10 is seconds to get in.
    -- function can take parameters, but not used here.
    if phase == 1 then
        set = catalog:createCollectionSet( folder:getName(), parent, true )
        return false -- keep going
    elseif phase == 2 then
        leafPhotos = folder:getPhotos(false)
        if #leafPhotos > 0 then
            return false
        else
            return true -- done
        end
    elseif phase == 3 then
        coll = catalog:createCollection( str:fmt( '[^1]', folder:getName() ), set, true )
        app:logVerbose( "Created/assured \"Folder Leaf Photos\" collection ^1 in '^2'", folder:getName(), set:getName() )
        return false
    elseif phase == 4 then
        local collPhotos = coll:getPhotos()
        if #collPhotos > 0 then
            local collPhotoSet = tab:createSet( collPhotos )
            for i, photo in ipairs( leafPhotos ) do
                if not collPhotoSet[photo] then
                    toAdd[#toAdd + 1] = photo
                end
            end
        else
            toAdd = leafPhotos
        end
        if #toAdd > 0 then
            coll:addPhotos( toAdd )
        end
        return true
    end
end )
Note: each phase is a new transaction.
In your case, Olivier:
local keyword
local children
local s, m = cat:update( 10, "Keyword business", function( context, phase )
    if phase == 1 then
        keyword = catalog:createKeyword( "Test", {}, true, nil )
        return false -- continue to 2nd phase.
    elseif phase == 2 then
        children = keyword:getChildren()
        -- do something to catalog with children
        return true -- or nil.
    end
end )
This would not make sense if nothing further involved catalog writes (in which case just exiting a simple single-phase with-do method would suffice), but you get the idea, no?
R
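For readers without access to Rob's framework, here is a rough sketch of how a multi-phase driver of this kind could be built in plain Lua. It is not Rob's implementation: updateInPhases, runTransaction and maxPhases are hypothetical names, the callback receives only the phase number, and the convention "return false to continue, anything else to stop" is inferred from the examples above.

```lua
-- Call `func(phase)` once per phase, each call wrapped in its own transaction.
-- `runTransaction(fn)` is a hypothetical callback that runs fn() inside one
-- write-access block and returns after that block has closed.
-- Returns the number of the phase that ended the sequence.
local function updateInPhases(runTransaction, func, maxPhases)
  maxPhases = maxPhases or 100 -- guard against a callback that never finishes
  for phase = 1, maxPhases do
    local result
    runTransaction(function()
      result = func(phase)
    end)
    if result ~= false then -- true or nil means "done"
      return phase
    end
  end
  error("phase limit reached without completion")
end
```

Because each phase runs in a separate transaction, anything created in phase N has been committed by the time phase N + 1 reads it.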
John R. Ellis wrote:
Adobe wants all bugs and feature suggestions for Lightroom to be posted here:
http://feedback.photoshop.com/photoshop_family/products/photoshop_family_photoshop_lightroom
Those few of us developers active here generally start the subject lines with "SDK: ".
Thanks! I have created a dedicated thread: http://feedback.photoshop.com/photoshop_family/topics/sdk_getchildren_returns_an_exception_for_newly_created_lrkeyword
Matt Dawson wrote:
It is possible you will need to close a transaction (i.e. a with...Do call) between the first and second command, as the second one is accessing a change to the catalog that might not have been committed yet. This could be causing the bug you are observing. Does your workaround involve something like this? If so, it might not actually be a bug and instead be operating "as designed", but not documented clearly enough to be easily understood by those new to the SDK.
Hi Matt,
Yes, it works after the transaction is closed. My workaround, however, is to avoid calling getChildren() on newly created LrKeyword objects and to use a hard-coded empty list instead. Once new children are added, getChildren() starts working.
I don't know if this is by design, but it must be fixed. I see no reason why getChildren() should not work until the transaction is closed, even from the perspective of the SDK developers. It just looks like a missing initialization of some sort in the new LrKeyword object, which is fixed by closing the transaction or by adding children.
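The workaround described above could look roughly like the following plain-Lua sketch. It is not the code from the linked plugin: markNew, markUsable and safeGetChildren are hypothetical names, and it assumes keyword objects can be used as table keys.

```lua
-- Keywords created in the current transaction, for which getChildren() is
-- known to fail. Weak keys, so the table does not keep keywords alive.
local newlyCreated = setmetatable({}, { __mode = "k" })

-- Call right after creating a keyword.
local function markNew(keyword)
  newlyCreated[keyword] = true
end

-- Call once children were added to the keyword or the transaction has closed.
local function markUsable(keyword)
  newlyCreated[keyword] = nil
end

-- Returns an empty list for keywords still marked as new, and otherwise
-- defers to the keyword's own getChildren().
local function safeGetChildren(keyword)
  if newlyCreated[keyword] then
    return {}
  end
  return keyword:getChildren()
end
```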
Rob Cole wrote:
Very possibly.
I've been really stoked with my new multi-phase catalog accessor - strongly recommend writing something like it for yourselves (see source code in other thread, or contact me for fresh stuff).
Here's an example from my most recent plugin:
Thanks a lot for sharing your code, Rob! It's very interesting. As far as my case is concerned, I have now transitioned from Expression Media to Lightroom, so I have stopped working on this. But your code may come in handy when I write more plugins!