8 Replies Latest reply on Dec 1, 2014 1:29 PM by Dan G. Switzer, II

    Apache POST flex2gateway never closes or times out, reaches max child processes

    GuitsBoy

      We have been trying to pass an external PCI scan, and noticed some server lockups after starting a scan.  We are scanning a couple hundred IP addresses, which all resolve to the same servers.  The scans are actively looking for vulnerabilities on the box, and one of the things they probe is Flash Remoting.  When we look at the Apache /server-status page, it shows a ton of long-running flex2gateway requests.  For instance:

       

      Srv-PID   Acc          M  CPU   SS      Req  Conn  Child  Slot   Client     VHost                    Request
      22-44466  0/3817/3817  W  4.07  163840  0    0.0   57.76  57.76  x.x.x.101  WebNode2.ambassador.int  POST /flex2gateway/http HTTP/1.1

       

      As you can see, this POST request has been running for 163840 seconds, or nearly two days.  Since it seems these POST requests never complete, even though the client has long since disconnected, they simply stack up until the server's max number of child processes has been reached, effectively killing our webserver.
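      For a sense of scale: the SS column in mod_status is the number of seconds since the worker began its most recent request. A minimal Python sketch for turning that value into something readable and flagging suspicious workers follows. The function name and the five-minute threshold are arbitrary assumptions for illustration, not anything Apache defines.

```python
from datetime import timedelta


def describe_age(ss_seconds, limit_seconds=300):
    """Format a mod_status SS value and flag it if it exceeds the limit.

    limit_seconds is an arbitrary illustrative threshold (hypothetical).
    """
    age = timedelta(seconds=ss_seconds)
    verdict = "stuck?" if ss_seconds > limit_seconds else "ok"
    return f"{age} ({verdict})"


print(describe_age(163840))  # SS value from the status row above
```

      For the row above this prints 1 day, 21:30:40 (stuck?), which is a little under two days.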

       

      When I try to restart the clustered ColdFusion instances one at a time, these POST requests do not die off.

      If I stop both clustered CF instances, the requests complete (or get killed).

      If I reload or restart Apache, the requests are gone as well.

       

      strace gives me nothing useful:

      [root@WebNode1 ~]# strace -p 34025

      Process 34025 attached - interrupt to quit

      read(185,

       

      pstack gives a little more, but nothing that looks obvious to me:

      [root@WebNode1 ~]# pstack -p 34025     

      Usage: pstack <process-id>

      [root@WebNode1 ~]# pstack 34025  

      #0  0x00007fdd40444740 in __read_nocancel () from /lib64/libpthread.so.0

      #1  0x00007fdd33efe2e6 in jk_tcp_socket_recvfull () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

      #2  0x00007fdd33f1b68d in ajp_connection_tcp_get_message () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

      #3  0x00007fdd33f1ceea in ajp_get_reply () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

      #4  0x00007fdd33f20308 in ajp_service () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

      #5  0x00007fdd33ef8f5d in jk_handler () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

      #6  0x00007fdd41b92cd0 in ap_run_handler ()

      #7  0x00007fdd41b9658e in ap_invoke_handler ()

      #8  0x00007fdd41ba1c50 in ap_process_request ()

      #9  0x00007fdd41b9eac8 in ?? ()

      #10 0x00007fdd41b9a7d8 in ap_run_process_connection ()

      #11 0x00007fdd41ba6ad7 in ?? ()

      #12 0x00007fdd41ba6dea in ?? ()

      #13 0x00007fdd41ba7a6c in ap_mpm_run ()

      #14 0x00007fdd41b7e9b0 in main ()

       

      I don't know what that tells us exactly, but since the worker is sitting in a read inside mod_jk's ajp_get_reply, I'm leaning toward a hang between Apache and Tomcat.

       

      Any suggestions on how to troubleshoot this issue?

        • 1. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
          GuitsBoy Level 1

          I removed clustering by editing the uriworkermap.properties file and pointing /flex2gateway and /flex2gateway/* to a single instance, and then ran the PCI scan again.  It still seems to hang.  I'm surprised there are no other complaints about this out there on the interwebs.  I can't be the only one.

          • 2. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
            GuitsBoy Level 1

            OK, I did a little more testing from a Linux CLI using curl, and I find that if I POST to /flex2gateway/<any string> it hangs indefinitely.  A normal GET request results in a 404, but a POST hangs.  What's more, posting to just /flex2gateway/ seems to perform normally (some kind of binary data connection).  It's only if I put something in the path after /flex2gateway/ that it hangs.  It behaves the same if I hit one instance specifically, as opposed to through the cluster, so that eliminates Apache as the problem.  I also notice a hang when posting to /flex-internal/ and /flex-internal/<some string>

             

            Any clue as to why this might act this way?

            • 3. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
              GuitsBoy Level 1

              On a test server, I have removed the wildcard from the uriworkermap.properties file, so it now only matches "/flex2gateway" and "/flex2gateway/".  Unfortunately I'm still seeing the occasional hung Apache worker.


              Anyone have any leads on this issue?  I don't mind doing the research, I've just exhausted the limits of my Google Fu.


               

              Apache Server Status for 10.10.10.205

              Server Version: Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 mod_ssl/2.2.15 OpenSSL/1.0.1e-fips mod_wsgi/3.2 Python/2.6.6 mod_jk/1.2.32 mod_perl/2.0.4 Perl/v5.10.1
              Server Built: Oct 16 2014 14:48:21

              Current Time: Monday, 10-Nov-2014 16:49:22 EST
              Restart Time: Monday, 10-Nov-2014 15:25:16 EST
              Parent Server Generation: 0
              Server uptime: 1 hour 24 minutes 6 seconds
              Total accesses: 5313 - Total Traffic: 98.4 MB
              CPU Usage: u3.97 s1.26 cu0 cs0 - .104% CPU load
              1.05 requests/sec - 20.0 kB/second - 19.0 kB/request
              15 requests currently being processed, 11 idle workers
              WWWWWWW_W_W_W__W__W__WW_W_...................................... ................................................................ ................................................................ ................................................................ 

              Scoreboard Key:
              "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
              "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
              "C" Closing connection, "L" Logging, "G" Gracefully finishing,
              "I" Idle cleanup of worker, "." Open slot with no current process

               

              Srv   PID    Acc        M  CPU   SS    Req  Conn  Child  Slot   Client       VHost           Request
              0-0   8727   0/12/12    W  0.03  4572  0    0.0   0.05   0.05   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              1-0   8728   0/11/11    W  0.03  4358  0    0.0   0.18   0.18   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              2-0   8729   0/38/38    W  0.04  3910  0    0.0   1.11   1.11   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              3-0   8730   0/27/27    W  0.03  4064  0    0.0   0.79   0.79   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              4-0   8731   0/16/16    W  0.03  4354  0    0.0   0.12   0.12   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              5-0   8732   0/7/7      W  0.02  4564  0    0.0   0.02   0.02   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              6-0   8733   0/8/8      W  0.02  4673  0    0.0   0.01   0.01   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              7-0   8734   0/386/386  _  0.37  4     0    0.0   6.49   6.49   10.10.2.212  www.company.qc  GET /marketingpages/images/login_over.jpg HTTP/1.1
              8-0   9422   0/10/10    W  0.02  4564  0    0.0   0.04   0.04   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
              9-0   10112  0/393/393  _  0.37  6     0    0.0   14.59  14.59  10.10.2.212  www.company.qc  GET /marketingpages/images/box_onesource.jpg HTTP/1.1
              10-0  10468  0/321/321  W  0.32  846   0    0.0   4.42   4.42   10.10.2.212  qc.company.int  POST /flex2gateway HTTP/1.1
              11-0  10470  0/398/398  _  0.38  6     0    0.0   12.80  12.80  10.10.2.212  www.company.qc  GET /marketingpages/images/home_eco.jpg HTTP/1.1
              12-0  10471  0/340/340  W  0.32  837   0    0.0   4.99   4.99   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
              13-0  10544  0/404/404  _  0.34  6     0    0.0   5.21   5.21   10.10.2.212  www.company.qc  GET /marketingpages/images/box_top.jpg HTTP/1.1
              14-0  10592  0/353/353  _  0.40  6     12   0.0   14.10  14.10  10.10.2.212  www.company.qc  GET /?login HTTP/1.1
              15-0  10648  0/296/296  W  0.31  800   0    0.0   3.82   3.82   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
              16-0  12382  0/339/339  _  0.33  6     0    0.0   2.85   2.85   10.10.2.212  www.company.qc  GET /marketingpages/images/logo_sourceone.jpg HTTP/1.1
              17-0  12387  0/336/336  _  0.34  6     0    0.0   5.06   5.06   10.10.2.212  www.company.qc  GET /marketingpages/images/logo_onesource.jpg HTTP/1.1
              18-0  12388  0/265/265  W  0.25  839   0    0.0   2.87   2.87   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
              19-0  12389  0/323/323  _  0.31  0     0    0.0   4.82   4.82   10.10.2.212  www.company.qc  GET /marketingpages/lib/dimming.js HTTP/1.1
              20-0  12390  0/336/336  _  0.31  4     0    0.0   5.24   5.24   10.10.2.212  www.company.qc  GET /marketingpages/lib/superfish.js HTTP/1.1
              21-0  12391  0/289/289  W  0.27  805   0    0.0   2.49   2.49   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
              22-0  12392  0/281/281  W  0.27  831   0    0.0   3.17   3.17   10.10.2.212  qc.company.int  POST /flex2gateway HTTP/1.1
              23-0  14750  0/41/41    _  0.04  6     0    0.0   0.92   0.92   10.10.2.212  www.company.qc  GET /marketingpages/images/close.jpg HTTP/1.1
              24-0  14751  0/43/43    W  0.04  0     0    0.0   1.21   1.21   10.10.2.36   qc.company.int  GET /server-status HTTP/1.1
              25-0  14752  0/40/40    _  0.04  6     0    0.0   0.96   0.96   10.10.2.212  www.company.qc  GET /marketingpages/images/box_sourceone.jpg HTTP/1.1
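
              One way to keep an eye on this while testing is to tally the scoreboard string from the status page. A minimal Python sketch, assuming the string is copied by hand from the page above; the function name is made up for illustration.

```python
from collections import Counter

# Active part of the scoreboard line from the status page above
# (the trailing "." open slots are left out).
SCOREBOARD = "WWWWWWW_W_W_W__W__W__WW_W_"


def tally(scoreboard):
    """Count workers per state character, ignoring whitespace."""
    return Counter(ch for ch in scoreboard if not ch.isspace())


counts = tally(SCOREBOARD)
print(f"busy (W): {counts['W']}, idle (_): {counts['_']}")
```

              For this page it reports 15 busy and 11 idle, matching the "15 requests currently being processed, 11 idle workers" summary line. Note that one of the 15 is the /server-status request itself.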
              • 4. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
                Dan G. Switzer, II Level 1

                Make sure you have the following in one of your config files:

                 

                # enable Flex Gateway

                <IfModule jk_module>

                    JkMount /*.cfm ajp13

                    JkMount /*.cfc ajp13

                    JkMount /*.do ajp13

                    JkMount /*.jsp ajp13

                    JkMount /*.cfchart ajp13

                    JkMount /*.cfres ajp13

                    JkMount /*.cfm/* ajp13

                    JkMount /*.cfml/* ajp13

                    JkMountCopy all

                </IfModule>

                 

                If you add this to the end of the mod_jk.conf file, just be careful when updating your connector in the future, because it may remove the lines. These commands are required to get the flex2gateway working in CF10. Without these lines, we've seen the exact same behavior you're describing.

                 

                Hope this helps!

                • 5. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
                  GuitsBoy Level 1

                  Thanks for the response.  Where exactly did you need to add this block of code?  I tried adding it to the end of the mod_jk.conf file, as well as adding it to the default virtual host block in the httpd.conf files.  Neither seems to have helped when testing.  Thanks.

                  • 6. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
                    Dan G. Switzer, II Level 1

                    We have it in our mod_jk.conf file, but be careful when updating the connector because it may remove the code.

                     

                    Make sure you've restarted Apache/ColdFusion after adding the lines as well.

                     

                    You might want to return your uriworkermap.properties back to its original version.

                     

                    Here's the thread where I originally found the entries that needed to be added:

                    Re: Coldfusion 10 + Apache + Flex2gateway + Debian/Linux

                     

                    Maybe you can find more info from someone in that post.

                    • 7. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
                      GuitsBoy Level 1

                      Thanks Dan, but I think we're talking about different issues.  We are well past the 404 problem.  This was solved by an alternate fix:  Adding the following code to the uriworkermap.properties file:

                       

                      /flex2gateway/* = CFCluster

                      /flex2gateway = CFCluster

                       

                      My problem is not getting flex2gateway working; it works just fine.  The problem comes up primarily during a PCI scan, when the scan attempts to POST data to "http://10.x.x.x/flex2gateway/http" and the worker hangs indefinitely.  I can recreate the issue using curl like so:  curl --data "param1=value1&param2=value2" http://10.x.x.x/flex2gateway/http

                       

                      I have no such issue if I post to http://10.x.x.x/flex2gateway/ without the /http path.

                       

                      I have gotten around this problem by denying access to the /flex2gateway/http and /flex2gateway/httpsecure paths in the Apache config, since these paths are not used, nor do they even exist.

                       

                      <Location /flex2gateway/http>

                          Order deny,allow

                          Deny from all

                      </Location>

                      <Location /flex2gateway/httpsecure>

                          Order deny,allow

                          Deny from all

                      </Location>
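
                      To compare behaviour before and after a change like this without tying up a terminal, the curl test can be given a client-side time limit. Below is a minimal Python sketch of such a probe, assuming plain HTTP; the function name `probe` and the 10-second default are illustrative choices, not part of any tool mentioned in this thread.

```python
import http.client
import socket
from urllib.parse import urlsplit


def probe(url, body=None, timeout=10.0):
    """Send one plain-HTTP request; return the status code or "timeout".

    A non-None body (bytes) makes it a form-encoded POST, like `curl --data`.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    method, headers = "GET", {}
    if body is not None:
        method = "POST"
        headers["Content-Type"] = "application/x-www-form-urlencoded"
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request(method, path, body=body, headers=headers)
        return conn.getresponse().status
    except socket.timeout:
        # No reply within `timeout` seconds: the hang described above.
        return "timeout"
    finally:
        conn.close()
```

                      Calling probe("http://<host>/flex2gateway/http", b"param1=value1&param2=value2") with a real host in place of <host> should return "timeout" where the hang is present, and 403 once the deny rules are in effect. The timeout only frees the client; it says nothing about whether the Apache worker on the server side was released.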

                      • 8. Re: Apache POST flex2gateway never closes or times out, reaches max child processes
                        Dan G. Switzer, II Level 1

                        Gotcha.

                         

                        I think the problem is similar to what you see when Flex isn't configured properly. It would appear Apache is handing off the request and then waiting for ColdFusion to respond, but it doesn't know how to handle the resource. I wonder if there's something in the web.xml that needs to be updated as well so that CF knows how to handle the /flex2gateway/http and /flex2gateway/httpsecure URIs.