derision

ColdFusion 10 Enterprise - mod_jk keeps crashing

Jul 20, 2013 4:09 PM

Tags: #windows #apache #tomcat #mod_jk #cf10

Environment:

Windows Server 2003

CF10 Enterprise - update 11 installed

Apache 2.2 with mod_ssl

 

Error

 

After about 8 hours of use, the logs start showing the messages below and no further requests to ColdFusion are possible. Apache still serves static assets, however.

 

[Sun Jul 21 01:00:11 2013] [1796:1092] [error] ajp_service::jk_ajp_common.c (2470): Failed allocating AJP message buffer

[Sun Jul 21 01:00:11 2013] [1796:1092] [info] jk_handler::mod_jk.c (2748): Service error=-5 for worker=cfusion

[Sun Jul 21 01:01:36 2013] [1796:3492] [error] ajp_service::jk_ajp_common.c (2455): Failed allocating AJP message buffer

[Sun Jul 21 01:01:36 2013] [1796:3492] [info] jk_handler::mod_jk.c (2748): Service error=-5 for worker=cfusion

[Sun Jul 21 01:01:37 2013] [1796:2488] [error] ajp_service::jk_ajp_common.c (2455): Failed allocating AJP message buffer

[Sun Jul 21 01:01:37 2013] [1796:2488] [info] jk_handler::mod_jk.c (2748): Service error=-5 for worker=cfusion
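
To see whether the failures arrive in bursts or build up gradually, the error lines above can be tallied per minute. This is a minimal Python sketch, not something from the original post: the regular expression is an assumption based only on the line layout in the excerpt above, and `count_buffer_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# [timestamp] [pid:tid] [level] function::file (line): message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<pid>\d+):(?P<tid>\d+)\] "
    r"\[(?P<level>\w+)\] (?P<where>\S+) \((?P<line>\d+)\): (?P<msg>.*)$"
)

def count_buffer_failures(lines):
    """Count 'Failed allocating AJP message buffer' errors per minute."""
    per_minute = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m or m.group("level") != "error":
            continue
        if "Failed allocating AJP message buffer" in m.group("msg"):
            # Drop the seconds so that entries group by minute.
            minute = re.sub(r"(\d{2}:\d{2}):\d{2}", r"\1", m.group("ts"))
            per_minute[minute] += 1
    return per_minute

# The six lines quoted in the post.
sample = [
    "[Sun Jul 21 01:00:11 2013] [1796:1092] [error] ajp_service::jk_ajp_common.c (2470): Failed allocating AJP message buffer",
    "[Sun Jul 21 01:00:11 2013] [1796:1092] [info] jk_handler::mod_jk.c (2748): Service error=-5 for worker=cfusion",
    "[Sun Jul 21 01:01:36 2013] [1796:3492] [error] ajp_service::jk_ajp_common.c (2455): Failed allocating AJP message buffer",
    "[Sun Jul 21 01:01:36 2013] [1796:3492] [info] jk_handler::mod_jk.c (2748): Service error=-5 for worker=cfusion",
    "[Sun Jul 21 01:01:37 2013] [1796:2488] [error] ajp_service::jk_ajp_common.c (2455): Failed allocating AJP message buffer",
    "[Sun Jul 21 01:01:37 2013] [1796:2488] [info] jk_handler::mod_jk.c (2748): Service error=-5 for worker=cfusion",
]

for minute, n in sorted(count_buffer_failures(sample).items()):
    print(minute, n)
```

Run against the full log rather than this excerpt, the same tally would show when the first failure appears relative to the last Apache restart.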

 

Steps taken to try and address

 

  • Have reinstalled both Apache and ColdFusion.
  • Have reinstalled the Apache connector after fully patching ColdFusion.
  • Have attempted to tune the workers.properties / server.xml files (e.g. http://www.webtrenches.com/post.cfm/resolve-stability-problems-and-speed-up-coldfusion-10)
  • Have tried both multiple-instance and single-instance mode.
  • Have even attempted to get mod_proxy_ajp working. Because this server hosts approximately 20 websites, I would need to set up virtual hosts in the web.xml, but for some reason this does not work as expected with ColdFusion's version of Tomcat: I get 404s on all index.cfm requests when browsing a virtual host that proxies via the AJP port. The same goes for straight mod_proxy.

 

I think I am out of ideas short of switching to IIS, but I have reservations: CF10 had worker problems with IIS as well, and this is Win2k3, which means the horrible IIS6 and no native rewrite rules.

 
Replies
  • Jul 20, 2013 6:22 PM, in reply to derision

    I don't know what is going wrong for you there, and CF with Apache is not my environment, but some things come to mind that you can try.

    Are those errors reported in the "mod_jk" log file (cf10\config\wsconfig\N\)? Are there any Java errors in the "cf-out" or "cf-error" logs (cf10\cfusion\logs)? The reason I ask is that the Tomcat connector is reporting a buffer issue; Tomcat runs on Java, and sometimes one of the Java memory areas (heap, new, perm) could be reaching a limit.

     

    Something else that might be worth trying is adding the Tomcat native library so that Tomcat AJP uses APR rather than BIO. It is the weekend for me, so if you want more details on that, reply and I will respond when I am at work.

     

    I favour applying values similar to those listed in the link you mention. One point of contention with those values is worker.cfusion.connection_pool_timeout=60 in the mod_jk properties and connectionTimeout="60000" in server.xml. Some references suggest a value of 10 minutes rather than 1 minute, which would be connection_pool_timeout=600 and connectionTimeout="600000" (the first is in seconds, the second in milliseconds). Note you need to restart CF and Apache to apply those Tomcat changes.

     

    HTH, Carl.

     
  • Jul 21, 2013 2:42 AM, in reply to derision

    With the permanent generation it can be a good idea to set an initial size as well as a maximum. Windows 2003 32-bit, I guess? Suggest use -XX:PermSize=192m -XX:MaxPermSize=256

     

    Something else I neglected to mention about the Tomcat settings: as with Java, the initial or minimum settings are sometimes more important than the maximum. To that end, CF stability and performance can benefit from these settings in the mod_jk properties and server.xml respectively:
    worker.cfusion.connection_pool_minsize = connection_pool_size / 2 (half the maximum setting)
    minSpareThreads = maxThreads / 2 (ditto, half the maximum)
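
The half-of-maximum rule of thumb above can be written out as a small calculation. This is only an illustrative Python sketch: `suggested_minimums` is a hypothetical helper, and the value 400 passed to it is an assumed example maximum, not a recommendation.

```python
def half_of_max(max_value):
    """Return half of the maximum, rounded down, but never below 1."""
    return max(1, max_value // 2)

def suggested_minimums(pool_size, max_threads):
    """Apply the half-of-maximum rule to both sides of the connector.

    pool_size   -> worker.cfusion.connection_pool_size (mod_jk side)
    max_threads -> maxThreads on the AJP <Connector> (server.xml side)
    """
    return {
        "worker.cfusion.connection_pool_minsize": half_of_max(pool_size),
        "minSpareThreads": half_of_max(max_threads),
    }

# Assumed example maximums of 400 on both sides.
for name, value in suggested_minimums(400, 400).items():
    print(f"{name}={value}")
```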

     

    Regards, Carl

     
  • Jul 21, 2013 4:04 PM, in reply to derision

    First, my apologies: I noticed a typo in my earlier post. It should read:

    Suggest use -XX:PermSize=192m -XX:MaxPermSize=256m

     

    Regarding threads: unless specified, maxThreads=200. What I have suggested is to increase minSpareThreads, since the default of 10 leaves not very many threads waiting in the pool and stops Tomcat reaching its potential, because it is always removing threads from memory to reduce its footprint.

     

    Here are the references for Tomcat 7 (CF10 uses a customised Tomcat 7, not 6):

    http://tomcat.apache.org/tomcat-7.0-doc/config/ajp.html

    http://tomcat.apache.org/connectors-doc/reference/workers.html

     

    Keep in mind that while I have many CF10 production servers, they are all IIS, and my CF10 Apache knowledge is little more than setting it up and configuring it just to see what it looks like.

    To apply a minimum thread setting on IIS, the Tomcat connector properties file could look similar to this, e.g.:

     

    worker.list=cfusion
    worker.cfusion.type=ajp13
    worker.cfusion.host=localhost
    worker.cfusion.port=8012
    worker.cfusion.max_reuse_connections=250
    worker.cfusion.connection_pool_size=400
    worker.cfusion.connection_pool_minsize=200
    worker.cfusion.connection_pool_timeout=600

     

    Correspondingly server.xml AJP portion would have this syntax EG:

     

    <Connector port="8012" protocol="AJP/1.3"
               redirectPort="8445"
               tomcatAuthentication="false"
               maxThreads="400"
               minSpareThreads="200"
               connectionTimeout="600000" />
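
Because the two files have to be edited by hand, it is easy for them to drift apart, especially as connection_pool_timeout is in seconds while connectionTimeout is in milliseconds. Below is a minimal Python sketch that compares the pairs used in the example above. The pairing of connection_pool_size with maxThreads and connection_pool_minsize with minSpareThreads simply mirrors that example and is an assumption, not an established rule for Apache; `check_consistency` and `PAIRS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# (workers.properties key suffix, server.xml attribute, factor applied to the properties value)
PAIRS = [
    ("port", "port", 1),
    ("connection_pool_size", "maxThreads", 1),
    ("connection_pool_minsize", "minSpareThreads", 1),
    ("connection_pool_timeout", "connectionTimeout", 1000),  # seconds vs milliseconds
]

def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def check_consistency(props_text, connector_xml, worker="cfusion"):
    """Return a list of mismatches between the worker and the AJP connector."""
    props = parse_properties(props_text)
    connector = ET.fromstring(connector_xml)
    problems = []
    for suffix, attribute, factor in PAIRS:
        key = f"worker.{worker}.{suffix}"
        left, right = props.get(key), connector.get(attribute)
        if left is None or right is None:
            problems.append(f"{key} or {attribute} is not set")
        elif int(left) * factor != int(right):
            problems.append(f"{key}={left} does not match {attribute}={right}")
    return problems

PROPS = """
worker.list=cfusion
worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8012
worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size = 400
worker.cfusion.connection_pool_minsize= 200
worker.cfusion.connection_pool_timeout = 600
"""

CONNECTOR = (
    '<Connector port="8012" protocol="AJP/1.3" redirectPort="8445" '
    'tomcatAuthentication="false" maxThreads="400" minSpareThreads="200" '
    'connectionTimeout="600000" />'
)

print(check_consistency(PROPS, CONNECTOR) or "settings agree")
```

In practice the two strings would be read from the connector's workers.properties and from the AJP Connector element of server.xml.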

     

    Something important to mention, because I lack Apache experience: the Tomcat connector and configuration references carry some Apache-specific caveats, highlighted in red. Thread changes that have worked well on IIS may not suit Apache.

     

    You mention you're familiar with JVM tuning and monitoring. If you are savvy with the JVM tool jconsole (and to a lesser extent jvisualvm), you could use it to monitor graphically what is happening with the Tomcat threads. It is a bit hard to write that up quickly, though if you are interested, I did a talk and demo on that and can find the link to the recording.

     

    I will post some Tomcat native details soon, though I discuss that as well in the above-mentioned talk.

     

    Regards, Carl.

     

     

     
  • Jul 21, 2013 5:22 PM, in reply to derision

    Re promised tomcat native library details.

     

    What you could try here is to see if APR relieves the problem. Now don't get me wrong, I am just throwing ideas out there for you to try and could be going in the wrong direction. The way I see it, there are not many other log details to go on, and you are tackling the issue via 1) Tomcat adjustments and 2) Java parameters, and I have also suggested 3) monitoring. APR is a Tomcat alteration.

     

    When CF starts, CF10\cfusion\logs\coldfusion-error.log reports output like this, e.g.:

     

    org.apache.catalina.core.AprLifecycleListener init

    INFO: The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: d:\\ColdFusion10\\cfusion\lib;d:\\ColdFusion10\\cfusion\jintegra\bin;d:\\ColdFusion10\\cfusion\jintegra\bin\international;d:\\ColdFusion10\\cfusion\lib\oosdk\classes\win

    org.apache.coyote.AbstractProtocol init

    INFO: Initializing ProtocolHandler ["ajp-bio-8012"]

    org.apache.catalina.core.StandardService startInternal

    INFO: Starting service Catalina

    org.apache.catalina.core.StandardEngine startInternal

    INFO: Starting Servlet Engine: Apache Tomcat/7.0.23

    etc

     

    Here is the APR documentation

    http://tomcat.apache.org/tomcat-7.0-doc/apr.html

     

    Download the Windows binary distribution here (or other mirror)

    http://apache.mirror.uber.com.au/tomcat/tomcat-connectors/native/1.1.27/binaries/

     

    The downloaded ZIP contains two tcnative-1.dll files, one for 32-bit and one for 64-bit. Depending on whether your Windows 2003 is 32-bit or 64-bit, copy the appropriate tcnative-1.dll to CF10\cfusion\lib. Now when you restart CF, the coldfusion-error.log reports:

     

    org.apache.catalina.core.AprLifecycleListener init

    INFO: Loaded APR based Apache Tomcat Native library version.

    org.apache.catalina.core.AprLifecycleListener init

    INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].

    org.apache.coyote.AbstractProtocol init

    INFO: Initializing ProtocolHandler ["ajp-apr-8012"]

    org.apache.catalina.core.StandardService startInternal

    INFO: Starting service Catalina

    org.apache.catalina.core.StandardEngine startInternal

    INFO: Starting Servlet Engine: Apache Tomcat/7.0.23

    etc
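
A quick way to confirm which AJP implementation is in use after a restart is to look for the ProtocolHandler line in coldfusion-error.log, as in the two excerpts above. The following Python sketch is only an illustration: the pattern is an assumption derived from those excerpts, and `ajp_implementation` is a hypothetical helper name.

```python
import re

# Matches lines such as: INFO: Initializing ProtocolHandler ["ajp-bio-8012"]
HANDLER_RE = re.compile(
    r'Initializing ProtocolHandler \["ajp-(?P<impl>[a-z]+)-(?P<port>\d+)"\]'
)

def ajp_implementation(log_text):
    """Return (implementation, port) for the last AJP handler logged, or None."""
    matches = HANDLER_RE.findall(log_text)
    if not matches:
        return None
    impl, port = matches[-1]  # the most recent start-up wins
    return impl, int(port)

BIO_LOG = """org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["ajp-bio-8012"]
"""

APR_LOG = """org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["ajp-apr-8012"]
"""

print(ajp_implementation(BIO_LOG))
print(ajp_implementation(APR_LOG))
```

Taking the last match matters because the log accumulates entries across restarts, so earlier "bio" lines may still be present after the native library has been installed.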

     

    Perhaps with Tomcat AJP switched to use APR you might find a successful outcome, or at least different log details which in turn might lead to a solution.

     

    HTH, Carl.

     
  • Jul 21, 2013 6:07 PM, in reply to derision

    As you know from jvisualvm, once the CF JVM has JMX parameters applied and jconsole connects, select the MBeans (managed beans) tab, then open Catalina (aka Tomcat) > ThreadPool > ajp-bio-port (or ajp-apr-port if the native library is installed) > Attributes. There you can open the thread count, threads busy and timeout attributes to see what is happening with the Tomcat connector at the CF end.

    EG: [screenshot: Capture.JPG]

     
  • Jul 22, 2013 7:05 PM, in reply to derision

    Thanks for your observations. Well, indeed, that would appear not overly stressed at all.

     

    Using RDP or the console, open Windows Task Manager | Processes tab: how much memory (Commit Size) does coldfusion.exe grow to? You might have to add that column and tick "Show processes from all users".

     

    It might also be interesting to know whether CF10 is using Java 6 or 7 (CF Admin > System Information | Java Version and Java VM Name).

     

    I hope the native tomcat library (APR) helps. Carl.

     
  • Jul 23, 2013 4:22 PM, in reply to derision

    Painful, I agree. At least the error details are consistent, though I do not know yet what the errors mean.

     

    Heap is larger than 1.4 GB, so I guess Win 03 and CF10 are 64-bit.

     

    When monitoring the AJP threads, double-click the currentThreadCount, currentThreadsBusy and keepAliveTimeout numbers (10, 0 & 0) to get a graph. There could still be something going on there unnoticed, since those values do not update unless you keep pressing the Refresh button.

     

    Regards, Carl.

     

    PS

     

    Further, CPU usage drops off at 6pm when the crash occurred, yet jconsole has not lost its connection to CF, so Java would still seem to be responding while something connector- or Tomcat-wise has terminated.


    The way I read it, the error is the same as before trying the Tomcat native library?


    Interesting that the cf-error log starts having issues 20 minutes before the crash. Are those errors new since a change was applied? As I read it, before there were no errors except those noticed in the mod_jk log.

     

    PPS

    This detail (-XX:PermSize=192m -XX:MaxPermSize=256m) mentioned earlier applies to 32-bit; it would perhaps not suit 64-bit, though I don't think we are dealing with a Java perm problem here. Just thought I should say so for interested readers.

     
  • Jul 23, 2013 6:22 PM, in reply to derision

    So switching AJP to APR has not stopped the crash but has given some extra log details. While that does not solve the immediate problem, it does provide some more leads to follow.

     

    Thread monitoring: that is where turning it into a graph can help, because you can see the history even if you miss pressing Refresh.

     

    Regards, Carl.

     
  • Jul 23, 2013 11:13 PM, in reply to derision

    Something else: does the localhost_access_log in CF10\cfusion\runtime\logs provide any useful information from approximately 20 minutes before and during the outage?

     
  • Jul 24, 2013 3:28 PM, in reply to derision

    The AJP graph will be interesting from a technical viewpoint, though it will not stop the outage.

     

    Railo: hard for me to say. I know IIS8/W2k12/CF10 works very well for me with some connector modifications applied.

     

    Regards, Carl.

     