8 Replies Latest reply on Feb 6, 2013 2:21 AM by MeasurableBusinessResults

    No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used

    lancedolan Level 1

      IF  :

      • JCR Resource Resolver [1] is used to allow requests missing the initial /content

             AND

      • Dispatcher has a /statfileslevel greater than 0

       

      THEN

       

           It is impossible for dispatcher to invalidate the cache.

       

       

      EXAMPLE:

       

      1. Content in CRX has /content/mysite/mylanguage/mypage
      2. Request comes in to dispatcher for /mylanguage/mypage.html
      3. CQ serves /mylanguage/mypage.html successfully
      4. Dispatcher saves the cache at /mylanguage/mypage.html
      5. Author server sends activation request for /content/mysite/mylanguage/mypage
      6. Dispatcher receives the activation request and touches the stat files at:
        1. /content
        2. /content/mysite
        3. /content/mysite/mylanguage
      7. Request comes in to dispatcher for /mylanguage/mypage.html
      8. The stat file at /mylanguage has never been touched, and statfileslevel > 0, so /mylanguage/mypage.html is considered valid
      9. Dispatcher returns the cached file at /mylanguage/mypage.html
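
The sequence above can be modelled in a few lines of Python. This is a simplified sketch, not Dispatcher's actual implementation: the function names, the timestamps and the level value of 3 are assumptions made for illustration, and the model assumes a cached file is only compared against the stat file of its own deepest covered directory.

```python
import posixpath

def stat_dirs(path, level):
    """Directories holding a stat file that covers `path`, docroot first."""
    parts = [p for p in posixpath.dirname(path).split("/") if p]
    depth = min(level, len(parts))
    return ["/"] + ["/" + "/".join(parts[:i + 1]) for i in range(depth)]

def touch(stat_times, handle, level, now):
    """Activation: touch every stat file along the activated handle."""
    for directory in stat_dirs(handle, level):
        stat_times[directory] = now

def is_fresh(stat_times, cached_path, cached_at, level):
    """A cached file is fresh unless its nearest stat file is newer."""
    nearest = stat_dirs(cached_path, level)[-1]
    return stat_times.get(nearest, 0) <= cached_at

LEVEL = 3          # assumed statfileslevel
stat_times = {}    # directory -> time its stat file was last touched
cached_at = 100    # steps 2-4: the page is cached at time 100

# Steps 5-6: activation arrives with the full repository path.
touch(stat_times, "/content/mysite/mylanguage/mypage", LEVEL, now=200)

# Steps 7-9: the shortened cache path is never covered by a touched stat file.
unmapped_fresh = is_fresh(stat_times, "/mylanguage/mypage.html", cached_at, LEVEL)
full_fresh = is_fresh(stat_times, "/content/mysite/mylanguage/mypage.html", cached_at, LEVEL)
print(unmapped_fresh, full_fresh)
```

In this model the shortened path stays fresh forever, while the same page cached under its full path would have been invalidated.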

       

      I have seen this suggestion from DAYCARE [2], but it says to rewrite the URL so that the cache is still stored under the full content path. We can't do this for multiple technical reasons, so I'm wondering whether there is instead a way to make the {path} variable in the extended replication agent options [3] intelligent enough to strip the activation path according to the URL Mappings, the same way the Link Rewriter rewrites links based on these exact settings. We really need the activation path to change rather than the request path in the web server.

       

       

      [1]: Go to http://localhost:4502/system/console/configMgr and click Apache Sling JCR Resource Resolver. Then add a URL Mapping such as /content/mysite/-/

       

      [2]: http://dev.day.com/content/kb/home/cq5/CQ5SystemAdministration/HowToFlushMappedContent.html

       

      [3]: Go to http://localhost:4502/etc/replication/agents.author/flush.html , click edit, and go to Extended tab

        • 1. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
          Jörg Hoh Adobe Employee

          For these cases I have 2 dispatcher farms configured:

          1. One for serving requests, where the docroot is configured to /var/dispatcher/htdocs/content
          2. One for invalidation requests, where the docroot is configured to /var/dispatcher/htdocs

           

           

          Farm 1 is configured with the proper /virtualhosts section. For the second farm (for invalidation) you can choose an arbitrary hostname (like "zzz_invalidation.local"). And the order matters: If a request does not match any farm, the last farm definition wins.

           

          So, by default an invalidation request does not have any Host header, so it does not match any farm and is therefore handled by the "zzz_invalidation.local" farm. I already tried adding a custom HTTP header to the invalidation agent ("Host: zzz_invalidation.local") and placing the invalidation farm somewhere other than the end, but that didn't work for me. It only worked with the zzz_invalidation.local farm being the last definition.
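
To illustrate why the two docroots line up, here is a rough Python model of the farm selection described above. It is only a sketch: the farm names, the host www.example.com and the helper functions are hypothetical, matching is reduced to a simple glob on the Host header, and it assumes the URL mapping strips exactly the /content prefix that the serving docroot adds back.

```python
import posixpath
from fnmatch import fnmatch

# Hypothetical farms, listed in the same order as in dispatcher.any.
FARMS = [
    {"name": "website",
     "virtualhosts": ["www.example.com"],
     "docroot": "/var/dispatcher/htdocs/content"},
    {"name": "zzz_invalidation",
     "virtualhosts": ["zzz_invalidation.local"],
     "docroot": "/var/dispatcher/htdocs"},
]

def select_farm(host, farms=FARMS):
    """Return the farm whose virtual hosts match, else the last farm."""
    if host:
        for farm in farms:
            if any(fnmatch(host, pattern) for pattern in farm["virtualhosts"]):
                return farm
    return farms[-1]

def cache_path(farm, url_path):
    """File system location of a URL path or handle inside a farm's docroot."""
    return posixpath.join(farm["docroot"], url_path.lstrip("/"))

serving = select_farm("www.example.com")  # normal page request
flushing = select_farm(None)              # invalidation request, no Host header

served_file = cache_path(serving, "/mysite/mylanguage/mypage.html")
flushed_dir = cache_path(flushing, "/content/mysite/mylanguage")
print(served_file)
print(flushed_dir)
```

Under these assumptions the directory touched by the invalidation farm is the same directory that holds the file cached by the serving farm.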

           

          For this case you don't need to deal with CQ side settings.

           

          Jörg

          • 2. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
            cmhdave73

            Here is how we had to resolve this (assuming an Apache server).  We did end up having to cache the content in the full path which I know you mentioned you are unable to do.

             

            http://www.wemblog.com/2012/07/how-to-use-dispatcher-with-mapped.html

            • 3. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
              aschermuly

              Sad that one has to apply such ugly tricks as Jörg did for such a simple and common task. [2] applies to 5.2 and 5.3, which means the problem has been known for quite some time.

               

              I usually used cmhdave's technique, but recently I had the idea of manipulating the header via Apache's mod_headers instead. To understand what's happening: the HTTP request issued by the publish (or author) instance to the dispatcher has an empty body but carries some headers. The important header is "CQ-Handle", which carries the handle to be activated. So just add

               

                   RequestHeader edit CQ-Handle /content(/.*) $1 early

               

              to the virtual host handling the dispatcher invalidation requests, and voilà: no nasty "/content" to be found in the dispatcher-debug.log anymore:

               

              [Wed Oct 17 18:15:48 2012] [I] [27427(47063570531184)] activation detected!! cq-action=Delete [/company/de/home/stuff]

              [Wed Oct 17 18:15:48 2012] [I] [27427(47063570531184)] Touched stat file '/var/dispatcher/docroot/company/de/home/stuff/.stat'

              [Wed Oct 17 18:15:48 2012] [I] [27427(47063570531184)] Touched stat file '/var/dispatcher/docroot/company/.stat'

              [Wed Oct 17 18:15:48 2012] [I] [27427(47063570531184)] Touched stat file '/var/dispatcher/docroot/.stat'

              [Wed Oct 17 18:15:48 2012] [D] [27427(47063570531184)] FlushCaches [...]

              [Wed Oct 17 18:15:48 2012] [D] [27427(47063570531184)] cache flushed
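
For readers who want to check what the header rule does to a given handle, the substitution can be approximated in Python. This is a sketch only: the function name is made up, Python's re module stands in for Apache's regex engine, and the sample handles other than the one from the log above are invented.

```python
import re

# Same pattern as in the RequestHeader directive. It is not anchored,
# so it matches the first "/content/" found anywhere in the handle.
CQ_HANDLE_RULE = re.compile(r"/content(/.*)")

def edit_cq_handle(handle):
    """Approximate 'RequestHeader edit CQ-Handle /content(/.*) $1'."""
    return CQ_HANDLE_RULE.sub(r"\1", handle, count=1)

for handle in ("/content/company/de/home/stuff",
               "/content/dam/company/logo.png",
               "/etc/designs/company"):
    print(handle, "->", edit_cq_handle(handle))
```

Note that the rule is applied to every handle starting with /content, DAM handles included, while handles outside /content pass through unchanged.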

               

              Cheers Achim

               

              P.S.: I would have rather liked Adobe to build in some regex handling either in the flushing agent or the dispatcher config....

              • 4. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
                Yogesh Upadhyay Level 4

                That looks promising. But make sure that you have a similar constraint for DAM as well, as those handles will also start with /content.

                 

                Yogesh

                • 5. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
                  Scott Brodersen Adobe Employee

                  Another option is to "manually" flush the cache by sending an HTTP request to Dispatcher. In doing so, you have the opportunity to customize the headers, inject logic to conditionalize them, etc:

                   

                  http://dev.day.com/docs/en/cq/current/deploying/dispatcher/page_invalidate.html#Manually Invalidating the Dispatcher Cache

                   

                  Perhaps a node change can start a workflow that creates and sends the http request. I haven't thought this through but it's an idea.
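
A minimal sketch of such a request in Python, assuming the documented manual invalidation format (a request to /dispatcher/invalidate.cache carrying CQ-Action and CQ-Handle headers). The host name, the function name and the prefix being stripped are assumptions for illustration, and the request is only built here, not sent.

```python
from urllib.request import Request

def build_flush_request(dispatcher_url, handle, strip_prefix="/content/mysite"):
    """Build an invalidation request whose handle matches the cached path."""
    # Custom logic goes here: shorten the handle the same way the
    # URL mapping shortens the public URLs.
    if handle.startswith(strip_prefix + "/"):
        handle = handle[len(strip_prefix):]
    return Request(
        dispatcher_url.rstrip("/") + "/dispatcher/invalidate.cache",
        data=b"",  # empty body, only the headers matter
        method="POST",
        headers={"CQ-Action": "Activate", "CQ-Handle": handle},
    )

req = build_flush_request("http://dispatcher.example.com",
                          "/content/mysite/mylanguage/mypage")
print(req.get_method(), req.full_url, req.header_items())
```

Sending it would be a matter of passing req to urllib.request.urlopen from a host that the dispatcher allows to flush.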

                   

                  scott

                  • 6. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
                    orotas Level 4

                    Something I have done in the past (although not with CQ 5.5) is to create a custom Dispatcher cache flushing agent that extends the standard one. You can then add logic to send the correctly mapped URLs to dispatcher (potentially using multiple requests if a resource in the repository can map to multiple possible paths in the cache root). You should be able to use the resource resolver to figure out what all the valid variations are. You would probably have to create or extend a TransportHandler & a ContentBuilder (you might be able to extend the existing Dispatcher Flush ones) and also create a flush agent template and component.
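
The "multiple possible paths" part can be pictured with a small Python sketch. A real agent would be written in Java and would ask the resource resolver for the mappings; here the mapping table and the function name are hypothetical and the logic is plain prefix replacement.

```python
def mapped_handles(repo_path, mappings):
    """Return every cache path that `repo_path` may be stored under.

    `mappings` is a list of (repository prefix, external prefix) pairs.
    """
    handles = []
    for repo_prefix, external_prefix in mappings:
        if repo_path.startswith(repo_prefix):
            candidate = external_prefix + repo_path[len(repo_prefix):]
            if candidate not in handles:
                handles.append(candidate)
    # Without any matching mapping, fall back to the repository path.
    return handles or [repo_path]

# Hypothetical mappings: the site is reachable with and without "/mysite".
MAPPINGS = [("/content/mysite/", "/"), ("/content/", "/")]

handles = mapped_handles("/content/mysite/mylanguage/mypage", MAPPINGS)
print(handles)
```

The agent would then send one invalidation request per returned handle.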

                    • 7. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
                      Yogesh Upadhyay Level 4

                      Found that there is one problem with

                       

                      RequestHeader edit CQ-Handle /content(/.*) $1 early

                       

                      Images and component data under par get stored under content/urpath/_jcr_root/par, and for some reason they do not go through the resource resolver rule.

                       

                      So, for example, if you access the page yoursite/mypage.html, you will have the following files in the cache under the docroot:

                      mypage.html and content/mypage/_jcr_root/par/anycomponent/data. When you activate that page, only mypage.html will be cleared, not content/mypage/_jcr_root/par/anycomponent/data, because of the above header setting.

                       

                      Hopefully that makes sense.

                       

                      Yogesh

                      • 8. Re: No cache invalidation when statsfilelevel > 0 and JCR Resource Resolver URL Mapping being used
                        MeasurableBusinessResults Level 1

                        This will not work, because by default the jcrresolver only shortens some URL patterns (*.html), but not all.

                        So DAM URLs and stuff beneath a page (jcr_content/*) like images will still have a URL starting with /content.

                         

                        Also, Jörg's suggestion is suboptimal, because if your docroot is /var/dispatcher/htdocs/content, then other URL patterns like /etc/site/something will get cached beneath

                        /var/dispatcher/htdocs/content/etc

                        which again will not work with the flush request even with the second farm configuration. So if you use dispatcher flushing for anything outside /content it will not work.

                         

                        The best solution is to always cache files under the full, correct path as in CQ. The way to achieve that is to use Apache rewrite rules to reverse the effect of the jcrresolver mapping for exactly those URLs and domains only, e.g.:

                         

                          RewriteCond %{REQUEST_URI}     !^/content     [NC]
                          RewriteRule     ^/(.*\.(html|rss|pdf|json).*)$     /content/example_com/$1     [PT,L]
                        

                        You need to match the file extensions here with what you are actually caching and mapping, of course.
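
To try the rule against sample URLs without touching the web server, its logic can be approximated in Python. This is a sketch under assumptions: Python's re module stands in for mod_rewrite, the function name is invented, and the [PT,L] flags are not modelled.

```python
import re

# RewriteCond %{REQUEST_URI} !^/content [NC]
STARTS_WITH_CONTENT = re.compile(r"^/content", re.IGNORECASE)
# RewriteRule ^/(.*\.(html|rss|pdf|json).*)$ /content/example_com/$1
RULE = re.compile(r"^/(.*\.(html|rss|pdf|json).*)$")

def rewrite(uri):
    """Return the path the request is mapped to before it is cached."""
    if STARTS_WITH_CONTENT.search(uri):
        return uri  # condition fails: already a full content path
    match = RULE.match(uri)
    if match is None:
        return uri  # extension not covered by the rule
    return "/content/example_com/" + match.group(1)

for uri in ("/de/home.html", "/content/dam/site/doc.pdf",
            "/etc/designs/site.css", "/etc/site/data.json"):
    print(uri, "->", rewrite(uri))
```

In this model a URL such as /etc/site/data.json is also moved under /content/example_com, because the rule selects by extension only, so the extension list has to be chosen with that in mind.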