It is best to first get an idea of what the CF JVM is doing. There are many ways to do that: JVM logging; the JDK tools jconsole and jvisualvm; CF Monitor (CF8, 9, 10) and CF Server Manager (CF9, 10), both in a limited way; CF JRun metrics (CF7, 8, 9) or CF Tomcat metrics (CF10); and perhaps FusionReactor and SeeFusion have some tools. I think in your case JVM logging would be best: analyse what is happening, then, knowing what is occurring in the CF JVM, apply a change and monitor again. How to enable JVM logging, and a tool to help read and understand the log, are covered further down.
CF version (I suspect 9.0.n, but you do not say) and edition?
Java version that CF is using, e.g. 1.6.0_24?
Are the operating system, CF and Java all 64-bit?
Probably of no bearing, but: Windows or Linux? IIS or Apache?
A sample of the log error message that shows the heap has a problem?
Add these to your JVM args on a single line, without line breaks. Copy or back up your jvm.config before applying the change. CF needs to be restarted for the change to take effect.
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:cfjvmGC.log
This creates the log file ColdFusion\runtime\bin\cfjvmgc.log
(or in Jrun4\bin\ in the case of a multiserver install).
Use the GCViewer tool to graphically examine the contents of "cfjvmgc.log".
I would not like to suggest a change without some details on the actual error and a look at the JVM logs.
Once we have some details, there are a few things that come to mind:
-Setting the initial heap size the same as the maximum can lead to a fragmented heap, though it might be OK
-It could be a non-heap memory area that is filling, e.g. Perm or Code Cache
-Set a New generation value (e.g. -Xmn184m) so the JVM does not make a poor guess at the size
-Garbage collection every 10 minutes: OK, so you are trying to keep the heap evacuated for now
-You could try a GC routine other than UseParallelGC
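To illustrate the heap versus non-heap distinction in the list above, here is a minimal Java sketch that lists the memory pools of the JVM it runs in, using the standard java.lang.management API. The class name MemoryPoolReport is made up for illustration. Pool names (Perm Gen, Code Cache and so on) vary with Java version and collector, and the sketch reports on its own JVM rather than on the CF server, so treat it as a picture of what to look for in jconsole rather than a diagnostic for CF itself.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class MemoryPoolReport {

    // One line per memory pool: heap or non-heap, pool name, used and max size.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            String type = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            long usedMb = usage.getUsed() / (1024 * 1024);
            // getMax() returns -1 when the pool has no defined maximum
            String max = usage.getMax() < 0
                    ? "undefined"
                    : (usage.getMax() / (1024 * 1024)) + " MB";
            lines.add(String.format("%-8s %-30s used=%d MB max=%s",
                    type, pool.getName(), usedMb, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

A non-heap pool sitting close to its maximum would point towards the Perm or Code Cache item in the list rather than towards Xmx.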
Amendments and additions to my earlier post that apply to CF10:
The JVM log file "cfjvmgc.log" will be in ColdFusion10\cfusion\bin, or in ColdFusion10\"instance"\bin if a new instance was added via Enterprise Manager > Instance Manager.
ServerStats could help resolve JVM heap issues. With it there is no need to restart CF10 to apply JVM logging, or to use JDK-style tools such as jconsole, to get a look at heap memory and CPU usage. Ref:
Hope that’s helpful for readers, Carl.
Hi carl type3,
I have updated my JVM from version 1.6.0_17 to 1.6.0_24. Since updating the JVM there has been no "Heap Out of Memory" issue for 2 days. But it seems all my long-running requests are timing out, either in CFLOOP or in CFQUERY, though I have set a high timeout value for the page request and for CFQUERY.
In my local development environment the pages work fine without request timeouts, but they give timeout errors on the staging server.
Following is my Server Configuration:
ColdFusion Version: 9,0,1,274733
Operating System: Windows Server 2008 R2
Adobe Driver Version: 4.0 (Build 0005)
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
While monitoring my application log I found some error messages like:
"java.lang.OutOfMemoryError: GC overhead limit exceeded".
"java.lang.OutOfMemoryError: Java heap space at org.apache.xerces.dom.DeferredDocumentImpl.createChunk(Unknown Source)"
Is it due to the garbage collector?
Too much time is being spent in garbage collection. That could be because of the heap (Xms, Xmx), non-heap (PermSize, MaxPermSize), or the suitability of the garbage collector routine (UseParallelGC) for the workload. The warning can be disabled by adding the option -XX:-UseGCOverheadLimit to the JVM args; however, I would prefer to fix the problem, which will be causing some slow application response, rather than simply turn off the warning.
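As background, HotSpot's documentation describes this error as being raised when roughly more than 98% of total time goes to GC while less than 2% of the heap is recovered. The Java sketch below is only a rough illustration of the "share of time spent in GC" idea: it sums the collection time reported by the standard GarbageCollectorMXBean interface and divides by JVM uptime. That is a whole-of-uptime average for the JVM the code runs in, not the JVM's own internal check and not a measurement of the CF server; the class and method names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
class GcOverhead {

    // Fraction of elapsed time spent in GC (0.0 to 1.0).
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0; // avoid dividing by zero
        }
        return (double) gcMillis / uptimeMillis;
    }

    // Accumulated collection time across all collectors in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, 100.0 * gcFraction(gc, uptime));
    }
}
```

The same totals are visible per collector in jconsole, and the GC log enabled earlier gives the per-collection detail that an average like this hides.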
I do not have enough details to recommend which JVM arg setting to alter, since frequent GCs that are not releasing memory might be due to multiple issues. JVM logs, if enabled and analysed, could help find a solution. If you suspect the matter is heap related, you could set the New generation space, which is part of the heap (heap = Old + New, where New = Eden + 2 Survivor spaces); the JVM args would then look like the first example below. If you suspect the Permanent generation is not big enough, see the second example. If you suspect the suitability of the GC routine, that would be another set of JVM args again.
java.args=-Duser.timezone=America/Chicago -server -Xmx2048m -Xms2048m -Xmn184m -Dsun.io.useCanonCaches=false ...etc
java.args=-Duser.timezone=America/Chicago -server ...etc -XX:PermSize=256m -XX:MaxPermSize=512m ...etc
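To make the heap = Old + New arithmetic concrete for the first example (-Xmx2048m -Xmn184m), here is a small Java sketch. It assumes a SurvivorRatio of 8 (Eden is eight times the size of one Survivor space), which is a common HotSpot default but is my assumption here; with UseParallelGC the JVM may resize Eden and the Survivor spaces adaptively, so the real split can differ. Class and method names are made up for illustration.

```java
// Hypothetical helper class, for illustration only.
class HeapLayout {

    // Old generation = total heap minus New generation (sizes in MB).
    static long oldGenMb(long heapMb, long newGenMb) {
        return heapMb - newGenMb;
    }

    // New = Eden + 2 Survivor spaces, with Eden = survivorRatio * one Survivor space.
    static double edenMb(long newGenMb, int survivorRatio) {
        return newGenMb * (double) survivorRatio / (survivorRatio + 2);
    }

    // Size of each of the two Survivor spaces.
    static double survivorMb(long newGenMb, int survivorRatio) {
        return newGenMb / (double) (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heapMb = 2048;    // from -Xmx2048m
        long newGenMb = 184;   // from -Xmn184m
        int survivorRatio = 8; // assumed, not taken from the thread
        System.out.printf("Old=%d MB, Eden=%.1f MB, each Survivor=%.1f MB%n",
                oldGenMb(heapMb, newGenMb),
                edenMb(newGenMb, survivorRatio),
                survivorMb(newGenMb, survivorRatio));
    }
}
```

Under those assumptions Old comes to 1864 MB, Eden to about 147 MB and each Survivor space to about 18 MB, which shows how small the New generation is relative to the heap in that example.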
As I recall, there was a problem (leak) with UseParallelGC in Java 1.6.0_17 that was fixed from 1.6.0_21 (?) onwards, so it is perhaps no surprise you're getting better uptime with the Java update, though with the GC overhead limit being hit you are not far from a full-heap problem.
You have an Enterprise licence. Are you able to get any useful diagnosis from running CF Monitor?
I’d propose that the frequent “GC overhead” errors are more simply just a reflection that something is holding memory (in the CF heap). When the JVM repeatedly tries to do a GC and cannot recover much (in a couple of minutes), then it throws this message. The solution is to find what’s holding memory.
As Carl noted earlier in the thread, there are many ways to attack this, but I would propose that JVM tools are not the answer. The simpler question is “what in CF could be holding memory for extended periods, that may be in your case”, Upen. Such things most often are either excessive use of session variables, application variables, server variables, and/or query caching. And all these can be caused to be used all the more by large amounts of spiders/bots and other automated requests being made to the CF server, which causes CF to create a new session on each page request (from such an automated request) rather than “once per session” as would be the case from more typical browsers. Too much to explain here, Upen, but I’ve discussed it before at:
Hope that helps.
Yes, I was creating some long-running cfthreads in my application. Although I had some logic to limit the number of threads created, I was not killing any hung threads.
Inside those threads I was getting data from a third-party server in batches, and the data size was sometimes very large, such as 56k images or 50MB of text data (with more than 35k records in that text file). So I guess that when a thread hangs it holds on to all the resources it had already acquired (e.g. memory) and does not release them for as long as it is alive. That was creating trouble for me.
Now I identify those long-running threads (threads running for more than some time limit, let's say 2 hours) and then kill them.
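For readers, the watchdog idea described above can be sketched in plain Java (this is not CFML and does not use cfthread; the class LongTaskWatchdog and its methods are hypothetical). Each task is recorded with a start time, and a sweep cancels any task that has run past the limit. Note that Future.cancel(true) only interrupts the thread: a task blocked in non-interruptible I/O will not stop, and whatever memory it holds stays held until it actually ends.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, for illustration only.
class LongTaskWatchdog {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    // Each submitted task mapped to its start time in milliseconds.
    private final Map<Future<?>, Long> started = new ConcurrentHashMap<>();

    // Submit a task and remember when it started.
    Future<?> submit(Runnable task, long nowMillis) {
        Future<?> future = pool.submit(task);
        started.put(future, nowMillis);
        return future;
    }

    // Cancel tasks that have run longer than limitMillis; returns how many were cancelled.
    int sweep(long nowMillis, long limitMillis) {
        int cancelled = 0;
        Iterator<Map.Entry<Future<?>, Long>> it = started.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Future<?>, Long> entry = it.next();
            Future<?> future = entry.getKey();
            if (future.isDone()) {
                it.remove(); // finished on its own, stop tracking it
            } else if (nowMillis - entry.getValue() > limitMillis) {
                if (future.cancel(true)) { // interrupts the worker thread
                    cancelled++;
                }
                it.remove();
            }
        }
        return cancelled;
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        LongTaskWatchdog watchdog = new LongTaskWatchdog();
        long start = System.currentTimeMillis();
        watchdog.submit(() -> {
            try {
                Thread.sleep(60_000); // stands in for a slow remote call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // cancelled: exit promptly
            }
        }, start);
        // Pretend three hours have passed and apply a two-hour limit.
        long threeHours = 3 * 60 * 60 * 1000L;
        long twoHours = 2 * 60 * 60 * 1000L;
        int cancelled = watchdog.sweep(start + threeHours, twoHours);
        System.out.println("cancelled " + cancelled + " task(s)");
        watchdog.shutdown();
    }
}
```

Passing the current time into submit and sweep keeps the logic easy to test; in real use both would be System.currentTimeMillis().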
It seems everything is working fine for now.
Carl, I think the above-mentioned thread creation was my problem. Thank you for sharing the information regarding JVM management.
I read your post, it was really informative for me. Thanks for sharing. I will follow it from now.
Thanks for the update. Personally, I’m never a fan of “killing threads”. The solution seems instead to make them stop taking so long. (And of course, this is going beyond the subject of “heap out of memory issue”.)
But you made a mention of images. Are you by any chance doing CFIMAGE action=”resize”, or imageresize(), or imageScaletoFit()? If so, any of these could be your culprit, especially if you’re processing many images that way. The bad news is that there’s a default that may be hurting performance. The good news is that there’s a simple fix.
Check out a blog entry I just created to explain the issue (with solutions):
I realize it may not be your problem, but let us know either way.
And as for finding out what IS holding up your long-running requests, I strongly recommend you consider a couple of other blog entries I’ve done, on both being misled by “timeouts” and on doing stack tracing in CF to know the exact line of code at a point in time in a long-running request:
Hope that helps.
Hi Charlie Arehart,
Sorry for the late reply. I have gone through all your posts and the CFMeetup video session you mentioned, and I found them very helpful.
Finally, I was able to find the root cause of the hanging threads and CPU overhead. It was a session timeout on a third-party server during the communication process. I have solved that issue.
Regarding thread killing: after reading your post and watching the video session, I also feel that we should remove that thread-kill logic. After some testing I will do that, if everything goes fine.
Great to hear. Thanks for the update.