7 Replies Latest reply on Jun 4, 2014 11:48 AM by BKBK

    CF10 was working for months, now crashes. How do i troubleshoot cause?

    BrighamJudd

      We've had CF10 running for months now with very few problems. This Tuesday I started to hear of the CF instance crashing in the mornings. I restarted the service and it ran again for the day. On Wednesday it had crashed again, and I had to restart the service twice before it would start working. On Thursday the service was dead again. All my efforts to restart the service or reboot the server only allow it to work for minutes at a time. It starts out fast as normal, then slows, then really lags, then queries throw Time Out errors, and then the CF instance crashes again. It was on and off like this all day, and it is still happening today.

       

      I tried tweaking some CF Admin settings, but nothing seems to give lasting help. The server itself seems to be running just fine (Windows server running IIS). I can serve plain HTML pages continuously, even during the worst behavior of CF. I checked my SQL server as well, and it is running fine, with no strange or overwhelming queries. I rerouted traffic to the staging server instead of the live production server, and after a few minutes that server went down too. It acts almost like a slow memory leak, but server monitoring in CF Admin shows steady memory usage, with no dramatic increases.

       

      I am not a super knowledgeable server admin; I'm mainly a CF coder, but I am tasked with keeping this server running.

       

      How do I troubleshoot this?

      What should I be looking for?

      Any ideas what would cause this behavior and just suddenly start doing this now?

       

      Thanks in advance for any input.

        • 1. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
          carl type3 Level 4

          >How do I troubleshoot this?
          I think you have come to the right place. Can you provide more information? Are Windows and CF10 32-bit or 64-bit? What is the CF10 update level?

           

          >What should I be looking for?
          Details, errors, or warnings in the coldfusion-out and coldfusion-error logs under CF10\cfusion\logs can be helpful.

           

          >Any ideas what would cause this behavior and just suddenly start doing this now?
          Not yet

           

          HTH, Carl.

          • 2. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
            BrighamJudd Level 1

            Carl,

             

            Thanks for responding.

             

            Windows Server 2008 R2 Standard, 64-bit, running on a server with dual 2.4 GHz processors and 4 GB of RAM. ColdFusion is version 10 with Update 13 in place.

             

             

             

            Latency gets longer and longer after restarting the service until ColdFusion becomes completely unresponsive (page cannot be displayed). There are generally no errors, though occasionally today I have seen spikes of Time Out errors on cfqueries or cfloops. Those errors come in all at once, sometimes as many as 40 at a time. Then the service picks up again and works, but slowly, and eventually it stops responding entirely again. When the CF service is restarted, it starts fresh with fast response times for a few minutes but quickly degrades.

             

            I don't see anything glaring in the logs, but I may not know what I am looking at. They are a bit large to post here. The coldfusion-error log does have this message repeated every few seconds:

             

            "May 30, 2014 5:48:04 PM org.apache.tomcat.util.http.Cookies processCookieHeader

            INFO: Cookies: Invalid cookie. Value not a token or quoted value"

             

            But that message has been appearing for as far back as I can see. It is something worth fixing, I suppose, but it is not likely the cause of this issue.
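
            When a log is too large to read by eye, one way to check whether anything besides the repeated cookie message is hiding in it is to count how often each distinct message occurs. The following is only a minimal Python sketch, assuming the two-line entry format quoted above (a timestamp line followed by a `LEVEL: message` line). The function name `summarize_log` is made up for illustration, and the second sample message is a placeholder, not a real log entry.

```python
import re
from collections import Counter

# Matches the second line of an entry, e.g. "INFO: Cookies: Invalid cookie. ..."
# The level names are the standard java.util.logging levels.
LEVEL_LINE = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST): (.*)$")


def summarize_log(lines):
    """Count how often each (level, message) pair occurs in the given lines."""
    counts = Counter()
    for line in lines:
        match = LEVEL_LINE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample input: the cookie message is the one quoted in this thread;
# the SEVERE line is a hypothetical placeholder, not from the real log.
sample = [
    "May 30, 2014 5:48:04 PM org.apache.tomcat.util.http.Cookies processCookieHeader",
    "INFO: Cookies: Invalid cookie. Value not a token or quoted value",
    "May 30, 2014 5:48:09 PM org.apache.tomcat.util.http.Cookies processCookieHeader",
    "INFO: Cookies: Invalid cookie. Value not a token or quoted value",
    "May 30, 2014 5:48:10 PM example.Placeholder method",
    "SEVERE: placeholder message for illustration",
]

# Rarest messages are often the interesting ones, so print them first.
for (level, message), count in sorted(summarize_log(sample).items(), key=lambda kv: kv[1]):
    print(count, level, message)
```

            To run it against a real file, pass an open file handle instead of `sample`, for example `summarize_log(open(path, errors="replace"))`, where `path` points at the log in question.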

            • 3. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
              BKBK Adobe Community Professional & MVP

              BrighamJudd wrote:

               

              ... starts working fast as normal, then slows, then really lags, then Time Out errors from queries and then crashes the CF instance again. on and off like this all day...

              From your description, you might have one of two problems:

               

              (1) Loops that run indefinitely, exhausting the CPU. You will have to find the offending page, or pages, and locate the code that causes the looping. A tool frequently used for this purpose is FusionReactor.

               

              (2) You upgraded from an older version of ColdFusion and did not delete all the web-server connectors of the older installation. Search the web and you will find, for example, a fix for slow IIS after upgrading from ColdFusion 8/9 to 10.

              • 4. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
                carl type3 Level 4

                No help from cf-error and cf-out logs.

                 

                Regarding CF10 Update 13: has the manual step to upgrade the CF10 Tomcat-to-IIS connector been done, by running CF10\cfusion\runtime\bin\wsconfig.exe, removing the connector, and then adding it again? Whether or not you have done that, the date stamp of CF10\config\wsconfig\1\isapi_redirect.dll would be interesting to know.


                Any errors or warnings in CF10\config\wsconfig\1\isapi_redirect.log?
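
                If it helps, both checks can be scripted. This is only a rough Python sketch: the helper names are made up, the keyword match on "error" and "warn" is an assumption about how problems appear in the connector log rather than a statement of its actual format, and the demo runs on a throwaway file instead of the real paths.

```python
import os
import tempfile
from datetime import datetime


def file_date_stamp(path):
    """Return the file's last-modified time as a local datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path))


def lines_mentioning_problems(path, keywords=("error", "warn")):
    """Return the lines that contain any keyword, compared case-insensitively."""
    hits = []
    with open(path, errors="replace") as handle:
        for line in handle:
            lowered = line.lower()
            if any(word in lowered for word in keywords):
                hits.append(line.rstrip("\n"))
    return hits


# Demo on a temporary file with placeholder content (not real connector output).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.log")
    with open(demo_path, "w") as handle:
        handle.write("placeholder line one\nplaceholder ERROR line\n")
    print(file_date_stamp(demo_path))
    print(lines_mentioning_problems(demo_path))
```

                On the server, you would call `file_date_stamp` on the isapi_redirect.dll path and `lines_mentioning_problems` on the isapi_redirect.log path, each prefixed with wherever CF10 is installed.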

                 

                HTH, Carl.

                • 5. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
                  tribule Level 2

                  Are the issues occurring at roughly the same time each day? A scheduled task, perhaps? If not, make sure the server is not under attack. We had similar issues, and it turned out someone was launching an attack on us, calling thousands of pages every few seconds. Check your web logs leading up to the issues for strange entries. Also consider installing FusionReactor to see which pages have issues. It could provide a clue.
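
                  One way to look for that kind of traffic in the web logs is to count requests per client address per minute. Here is a minimal Python sketch, assuming IIS is writing W3C extended logs with the `date`, `time`, and `c-ip` fields enabled. The function name is made up, and the sample lines use documentation-only IP addresses and invented requests, not real traffic.

```python
from collections import Counter


def requests_per_minute(lines):
    """Count requests per (client IP, minute) in W3C extended log lines."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns of the entries that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if not {"date", "time", "c-ip"} <= row.keys():
            continue
        minute = row["date"] + " " + row["time"][:5]  # keep HH:MM only
        counts[(row["c-ip"], minute)] += 1
    return counts


# Invented sample entries for illustration only.
sample = [
    "#Fields: date time cs-method cs-uri-stem c-ip sc-status",
    "2014-05-30 17:48:01 GET /index.cfm 192.0.2.10 200",
    "2014-05-30 17:48:02 GET /index.cfm 192.0.2.10 200",
    "2014-05-30 17:48:30 GET /about.cfm 203.0.113.5 200",
    "2014-05-30 17:49:05 GET /index.cfm 192.0.2.10 200",
]

# Busiest (client, minute) pairs first.
for (client, minute), count in requests_per_minute(sample).most_common(3):
    print(minute, client, count)
```

                  A single address making hundreds or thousands of requests in one minute just before a slowdown would be the kind of strange entry worth a closer look.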

                  • 6. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
                    BrighamJudd Level 1

                    Thanks to all those that contributed.

                     

                    For anyone that might read this in the future, to answer some of the suggestions made above:

                    • No, there were no infinite loops. The error started just last week and there had not been any code changes recently.
                    • No, we had not upgraded from an older version of CF.
                    • The Tomcat-to-IIS connector may have been relevant; that is a good thing to check in the future.
                    • I never got to the isapi_redirect.log.
                    • No, there were no scheduled tasks on the server.
                    • No, there was no unusual network traffic to indicate a denial of service attack or something similar.

                     

                    We engaged some internal server support people and two other suggestions came up.

                    1. ColdFusion's use of Java virtual memory was questioned as a possible problem.

                    2. Someone noted a previous problem with multiple server instances trying to run at once causing a conflict.

                    We also noticed that CPU usage on the server was holding steady at 50% and spiking to 100% several times a minute. This was particularly noteworthy because we were seeing it during a low-traffic period. We were unsure what CPU usage is usual on a CF server, but this did seem odd.

                     

                    With these suggestions in mind, I elected to remove the installation of CF10 and reinstall a fresh copy. I saved a copy of the Admin settings, uninstalled, removed the previous folder, and reinstalled. After I reconfigured the Admin settings and data source connections, the server was running again, better than ever. Three days later, the server is still running smoothly under full traffic, and even faster than before. I am still not entirely sure what the exact cause was, but the reinstall was a relatively painless step to take, so it was well worth the effort. CPU usage is now closer to 10%, with occasional spikes higher but nothing above about 80%.

                     

                    Thank you so very much for helping me troubleshoot and resolve this issue.

                    • 7. Re: CF10 was working for months, now crashes. How do i troubleshoot cause?
                      BKBK Adobe Community Professional & MVP

                      Then please mark your answer as correct.