8 Replies Latest reply on May 18, 2007 7:30 PM by bradwood.com

    cfdocument hanges on slow images

    bradwood.com
      Here's the deal, we have a CF web service which generates PDF files. You pass it HTML and the path to a file server and it writes a PDF there. Pretty simple. We have a Java app which generates appraisals on the fly and uses the CF web service to make a PDF.

      The problem: Once or twice a day a request would just hang. CPU usage would stay low, but the thread would just sit there. This was on CF 7.0.2 Standard which means cfdocument was single threaded and all other cfdocuments on the server would get in line and wait.
      We now have CF 7.0.2 Enterprise which multi-threads cfdocument, but it hasn't solved the initial problem.
      These hung threads will run for hours EVEN THOUH the timeout is 100 seconds. Furthermore, I CANNOT kill the thread with SeeFusion. I have to restart CF.

      So. I started digging. I would observe stack traces for the hung threads. Every time the thread hung it would be fetching images in the HTML. I knew this because the top of the thread stack always looks like this:

      "jrpp-80" waiting for monitor entry
      - waiting on <147> a com.icesoft.util.trackers.RenderDoneTracker)
      at com.icesoft.util.trackers.RenderDoneTracker.propertyChange(RenderDoneTracker.java:104)
      at ice.storm.StormBase.add(OEAB)
      at ice.storm.StormBase.sendViewportMessage(OEAB)
      at ice.storm.StormBase.stopViewportLoading(OEAB)
      at ice.storm.StormBase.stopLoading(OEAB)
      at coldfusion.document.DocumentProcessor.processContent(DocumentProcessor.java:149)
      at coldfusion.document.DocumentProcessor.ProcessContent(DocumentProcessor.java:59)
      at coldfusion.tagext.lang.DocumentTag.processContent(DocumentTag.java:1235)
      at coldfusion.tagext.lang.DocumentTag.doAfterBody(DocumentTag.java:1190)
      at cffilemanagement2ecfc930489299$funcCREATEPDF.runFunction(D:\Inetpub2\CustomTags\CFC\machI I\tempest\filemanagement.cfc:122)

      and there will be another thread every time that looks like this:

      "Image Fetcher 2" runnable
      - locked <142> (a java.io.BufferedInputStream)
      - locked <144> (a sun.net.www.protocol.http.HttpURLConnection)
      - locked <147> (a com.icesoft.util.trackers.RenderDoneTracker)
      at java.net.SocketInputStream.socketRead0(Native Method)
      at java.net.SocketInputStream.read(SocketInputStream.java:129)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:183)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:222)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:277)
      at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:760)
      at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:711)
      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:635)
      at java.net.URLConnection.getContent(URLConnection.java:585)
      at java.net.URL.getContent(URL.java:927)
      at coldfusion.document.DocumentImageRetriever.processImageSrc(DocumentImageRetriever.java:11 7)
      at coldfusion.document.DocumentImageRetriever.fetch(DocumentImageRetriever.java:69)

      Notice how the first thread is waiting on thread 147 which is locked by the Imagefetcher thread.

      Ok, so images are my problem. But which ones?
      I started by loading the HTML into IE while running MS Fiddler to look for 404's or other errors but there were none. My cfdocument problems were sporadic and random so it was something which would work most of the time but not every time.

      I waited for the problem to happen again and used TCPView.exe from SYSInternals to view all my open TCP connections. Sure enough I had an open TCP connection every time from jrun.exe to the same remote server. The HTML being generated referenced an image on this remote server.
      I also found at this point, that if I used TCPView to kill the TCP connection, the thread would "un-hang"-- proving that cfdocument was waiting on the image to return.

      Now I fired up Ethereal, which is a packet sniffer and had it log all TCP traffic to and from that server for two days until a request hung again. I waited 15 minutes, then killed the TCP connection (which coaxed the thread into finishing) and then looked at my network trace. Sure enough, at the same time the cfdocument started, the following TCP conversation happened between JRUN and the remote server:

      JRUN: SYN (read: you there?)
      Remote: SYN ACK (read: yeah, why?)
      JRUN: ACK (read: just checking.)
      JRUN: HTTP GET host/path/image_name (read: hey, can I have this image)
      Remote: ACK (read: sure, coming right up)

      That was the last TCP frame in the stream until I killed it 15 minutes later. The remote server never responded, and the ImageFetcher function just sat there by the phone at "java.net.SocketInputStream.socketRead0" till kingdom come.

      Now, I have no control over the remote server, so don't even suggest I try to fix their problems. Anyone outside my firewall is outside of my control.

      I want to know:
      1) Why the Cold Fusion Timeout isn’t observed
      2) Why I can't kill the thread manually with SeeFusion
      3) Why the HECK cfdocument has no internal timeout for HTTP requests. (I let one run for 24 hours before and it just sat there)
      4) Is this a known issue?
      5) Is there a bug request in for it?
      6) When will it be fixed?
      7) Where the heck are Spain's plains located, and why won’t the rain leave it alone?

      If you have any information, think you have seen this problem, would like more information, or are just trolling the forums late at night looking for conversation, please weigh in on this.

      Thanks!

      ~Brad