I'm testing an async process that I've been running for months and months on CF8 32-bit on a new Windows 2008 64-bit server. It typically runs for about 4-5 hours, with two threads handling data manipulation on two different sets of data. I have logging built in to tell me how long each batch of 500 items in each thread takes to process, so I can see at any time where the processes are. The reason I'm testing it on 64-bit is that when this process runs alongside some other bulky ones, it was hitting the 1.5Gb JVM ceiling, so I wanted to give it its own instance with a 3Gb heap in 64-bit.
I've already gone through the drill of updating my JDK (to 1.6.20) per Charlie Arehart's suggestion. What I'm seeing is really weird. The process runs along fine for about two hours, which gets it about halfway through the total number of items to be processed, and the speed is on par with or better than 32-bit. Then all of a sudden it screeches to a crawl, taking 4-5 times as long to process each chunk of items. What's weird is that up to that point, I can watch the JVM memory usage in the CF Server Monitor: it's spiky but stays consistently below the 3Gb ceiling. Right at the point of the slowdown, the memory levels off around 2.1Gb and stops spiking (image attached). I would have expected it to level off at max memory and then get slow, but if there's still almost a gig of memory free and nothing else is running on the server, why is it slowing down?

I verified there are no issues on the database servers I'm using, and this has happened exactly the same way twice now, once with the original JDK from CF8.0.1 and once with the new one. I checked the JRun logs, but there's nothing significant in there.
Does anybody have any suggestions for how to tackle this? Is it worth messing with 64-bit, or should I just create multiple 32-bit instances? Is CF9 any better at this than CF8?
Since the memory usage is flattening, could there be race conditions in your code? Does the batch slow down at the same point every time, or is it random? I would start by looking at all your variables: how and when they are used, and whether they are all properly scoped. It would help if you could post some code, too.
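To illustrate the scoping concern in JVM terms (CF variables ultimately live as JVM objects): two worker threads sharing unsynchronized state can corrupt each other's work, while thread-local variables plus an atomic shared total stay safe. A minimal sketch, with all class and variable names made up for the example:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ScopeDemo {
    public static void main(String[] args) throws InterruptedException {
        // shared total uses an atomic; each thread keeps its loop counter
        // local -- the analogue of properly var-scoping inside a CFC method
        final AtomicInteger processed = new AtomicInteger();
        Runnable worker = new Runnable() {
            public void run() {
                for (int i = 0; i < 500; i++) {
                    processed.incrementAndGet();
                }
            }
        };
        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start(); b.start();
        a.join(); b.join();
        // deterministic because the shared counter is atomic
        System.out.println("processed: " + processed.get()); // prints 1000
    }
}
```

If the two threads instead shared a plain `int` or an unscoped variable, the total would vary run to run, which is exactly the kind of bug that shows up only after hours of processing.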
I also see a slight problem with your memory profile image: the memory usage is constantly climbing. In a long-running process like yours, with lots of little units of work, your goal should be a more sawtooth-shaped profile. That shows the GC is actually doing its job.
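One way to see what the collector is actually doing at the slowdown point is to turn on GC logging. Assuming a standard CF8 jvm.config, you'd add flags to the java.args line (existing args elided, and the log path is just an example):

```
java.args=-Xmx3072m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:c:\cf_gc.log
```

If the log shows back-to-back full GCs right when the throughput drops, the JVM is spending its time collecting rather than running your code, even though the reported heap looks well under the ceiling.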
It looks like certain variables are being held in memory for the entire lifetime of the batch. That could be because some variables have the wrong scope, or because ColdFusion and Java aren't doing what you expect with those variables at the end of each step. Try clearing all the structures, arrays, lists, etc. at the end of each step.
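In JVM terms, "clearing at the end of each step" just means dropping references so the collector can reclaim each batch instead of dragging it through the whole run. A rough sketch of the pattern (names are hypothetical; in CF you'd use StructClear/ArrayClear or reassign the variable):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchCleanup {
    public static void main(String[] args) {
        List<String> chunk = new ArrayList<String>();
        for (int batch = 0; batch < 3; batch++) {
            // build up one unit of work (500 items per chunk, as in the post)
            for (int i = 0; i < 500; i++) {
                chunk.add("item-" + batch + "-" + i);
            }
            process(chunk);
            // release the references at the end of each step so the GC can
            // reclaim this batch before the next one starts
            chunk.clear();
        }
        System.out.println("retained after cleanup: " + chunk.size());
    }

    static void process(List<String> items) {
        // stand-in for the real data manipulation
    }
}
```

Without the `clear()` call, every batch stays reachable until the loop ends, and the heap profile climbs steadily, much like the flat-topped graph described above.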
If you do have to hold on to lots of variables at the end of each step, then they seem to be quite large. Perhaps it would be best to offload them to the filesystem and retrieve them when you need them again?
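A rough Java sketch of that spill-to-disk idea (in CF you'd reach for cffile or WDDX serialization instead; the class and method names here are invented for the example):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class OffloadToDisk {
    // serialize intermediate results to a temp file so they stop pinning heap
    static File spill(ArrayList<String> data) throws IOException {
        File f = File.createTempFile("batch", ".ser");
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f));
        try {
            out.writeObject(data);
        } finally {
            out.close();
        }
        return f;
    }

    @SuppressWarnings("unchecked")
    static ArrayList<String> reload(File f) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new FileInputStream(f));
        try {
            return (ArrayList<String>) in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> data = new ArrayList<String>();
        for (int i = 0; i < 500; i++) {
            data.add("row-" + i);
        }
        File f = spill(data);
        data = null; // the in-memory copy is now collectable
        ArrayList<String> restored = reload(f);
        System.out.println("restored: " + restored.size());
        f.delete();
    }
}
```

The trade is disk I/O for heap headroom, which is usually a good deal in a 4-5 hour batch where only the current chunk needs to be hot.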
Hope this gives you some ideas.