7 Replies Latest reply on May 17, 2018 8:59 AM by Jörg Hoh

    Unable to Prevent LinkChecking on Servlet Request

    luke.grover Level 1

      I have a simple servlet that should return some commerce data. I have simplified the example here, but the issue is that if any of the properties contains a link, I get invalid JSON. The link checker filter appears to be altering the response even though I have disabled it via configuration.

       

      package com.example.issue;
      
      
      import org.apache.sling.api.SlingHttpServletRequest;
      import org.apache.sling.api.SlingHttpServletResponse;
      import org.apache.sling.api.resource.Resource;
      import org.apache.sling.api.resource.ResourceResolver;
      import org.apache.sling.api.resource.ValueMap;
      import org.apache.sling.api.servlets.ServletResolverConstants;
      import org.apache.sling.api.servlets.SlingAllMethodsServlet;
      import org.json.JSONException;
      import org.json.JSONObject;
      import org.osgi.service.component.annotations.Component;
      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;
      
      
      import javax.servlet.Servlet;
      import javax.servlet.ServletException;
      import java.io.IOException;
      
      
      
      
      @Component(
          service= Servlet.class,
          property = {
                  SimpleCommerceServlet.RESOURCE_TYPE_DEFAULT,
                  SimpleCommerceServlet.SELECTOR,
                  SimpleCommerceServlet.METHOD_GET,
                  SimpleCommerceServlet.EXTENSION_JSON
          }
      )
      public class SimpleCommerceServlet extends SlingAllMethodsServlet {
      
      
          private static final long serialVersionUID = 1647028361800528653L;
          private static final Logger LOGGER = LoggerFactory.getLogger(SimpleCommerceServlet.class);
      
      
          public static final String RESOURCE_TYPE_DEFAULT = ServletResolverConstants.SLING_SERVLET_RESOURCE_TYPES + "=" + ServletResolverConstants.DEFAULT_RESOURCE_TYPE;
          public static final String EXTENSION_JSON = ServletResolverConstants.SLING_SERVLET_EXTENSIONS+"=json";
          public static final String METHOD_GET = ServletResolverConstants.SLING_SERVLET_METHODS + "=GET";
          public static final String SELECTOR = ServletResolverConstants.SLING_SERVLET_SELECTORS +"=getCommerceDetails";
      
      
          @Override
          protected void doGet(SlingHttpServletRequest request, SlingHttpServletResponse response) throws ServletException, IOException {
              // Keep the resolver local: a servlet instance is shared by concurrent requests.
              // It belongs to the request, so Sling closes it; the servlet must not.
              ResourceResolver resourceResolver = request.getResourceResolver();
              Resource resource = resourceResolver.getResource("/etc/commerce/products/we-retail/custom/sample_product");
              if (resource == null) {
                  response.sendError(SlingHttpServletResponse.SC_NOT_FOUND, "Product not found");
                  return;
              }
              try {
                  JSONObject jsonObject = new JSONObject();
                  ValueMap vm = resource.adaptTo(ValueMap.class);
      
                  jsonObject.put("Title", vm.get("jcr:title"));
                  jsonObject.put("Summary", vm.get("summary"));
      
                  response.setHeader("Content-Type", "application/json");
                  response.setCharacterEncoding("UTF-8");
                  response.getWriter().write(jsonObject.toString());
              } catch (JSONException e) {
                  // Pass the exception so the stack trace is logged.
                  LOGGER.error("Failed to get and process JSON", e);
              }
          }
      }
      

       

      I have created a new product at /etc/commerce/products/we-retail/custom/sample_product with properties

       

      jcr:title = Sample Product

      summary =

      <p>This is a summary but it also has a <a title="Google" href="https://www" target="_blank">link</a>.</p>
      

       

      Calling the Servlet like:

      http://localhost:8080/bin/commerce.getCommerceDetails.details.json

      results in invalid JSON:

      {"Title":"Sample Product","Summary":"<p>This is a summary but it also has a <img src="/libs/cq/linkchecker/resources/linkcheck_o.gif" alt="invalid link: Google\\" title="invalid link: Google\\" border="0">link<\/a>.<\/p>\r\n"}

       

      Note the insertion of the linkcheck gif!

       

      I have tried using LinkCheckerSetting to ignore Internal and External.

      I have disabled link checking and rewriting via com.day.cq.rewriter.linkchecker.impl.LinkCheckerTransformerFactory

      I have changed the Link Check Override to ^. in com.day.cq.rewriter.linkchecker.impl.LinkCheckerImpl

       

      Any other ideas on how to prevent the Link Checker from changing the response of this Servlet?

       

      I'm running AEM 6.3 SP1

        • 1. Re: Unable to Prevent LinkChecking on Servlet Request
          smacdonald2008 Adobe Employee

          This looks like a bug. If you disabled the Link Checker, it should not affect the servlet.

          • 2. Re: Unable to Prevent LinkChecking on Servlet Request
            smacdonald2008 Adobe Employee

            Check the article "Disable the CQ5 Link Checker" to make sure you have done everything that is required.

            • 3. Re: Unable to Prevent LinkChecking on Servlet Request
              Jörg Hoh Adobe Employee

              JSON requests are not rewritten by default. How is the rewriter configured on your system? Have you enabled it for JSON? In the web console you can check all rewriter configurations at localhost:4502/system/console/status-slingrewriter

               

              Jörg
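One possible reason a JSON response could still reach an HTML pipeline, sketched below as a stand-alone model. It rests on two assumptions that should be checked against the Sling Rewriter source: that the rewriter's response wrapper only learns the content type through `setContentType`, and that a missing content type is treated as `text/html`. The servlet in the question sets the type via `setHeader("Content-Type", ...)`, which such a wrapper would not see. The classes `PlainResponse` and `RecordingResponse` are hypothetical and are not Sling or Servlet API types.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone model; none of these classes are Sling or Servlet API types.
final class ContentTypeCaptureSketch {

    // Hypothetical minimal response that only stores headers.
    static class PlainResponse {
        final Map<String, String> headers = new HashMap<>();

        void setHeader(String name, String value) {
            headers.put(name, value);
        }

        void setContentType(String type) {
            headers.put("Content-Type", type);
        }
    }

    // Hypothetical wrapper that overrides setContentType but not setHeader.
    static class RecordingResponse extends PlainResponse {
        private String recordedType; // stays null unless setContentType is called

        @Override
        void setContentType(String type) {
            recordedType = type;
            super.setContentType(type);
        }

        // Assumption: a missing content type is treated as HTML.
        String typeSeenByRewriter() {
            return recordedType == null ? "text/html" : recordedType;
        }
    }

    public static void main(String[] args) {
        RecordingResponse viaHeader = new RecordingResponse();
        viaHeader.setHeader("Content-Type", "application/json");
        System.out.println(viaHeader.typeSeenByRewriter()); // prints text/html

        RecordingResponse viaSetter = new RecordingResponse();
        viaSetter.setContentType("application/json");
        System.out.println(viaSetter.typeSeenByRewriter()); // prints application/json
    }
}
```

If the assumptions hold, calling `response.setContentType("application/json")` in the servlet instead of `setHeader` may be enough to keep the response out of the HTML pipelines. That is untested in this thread.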

              • 4. Re: Unable to Prevent LinkChecking on Servlet Request
                luke.grover Level 1

                So are you suggesting that we need a custom Sling rewriter to resolve this? There isn't a rewriter for JSON, but the request does appear to get picked up by the default configuration at the end of the list.

                 

                Here is the response from the link /system/console/status-slingrewriter ... these are just the OOTB rewriters as the instance doesn't have custom ones.

                Current Apache Sling Rewriter Configuration
                =================================================================
                Active Configurations
                -----------------------------------------------------------------
                Configuration hybrid-app
                
                
                Name : hybrid-app
                Content Types : [text/html]
                Paths : [/content/phonegap, /content/mobileapps, /content/campaigns]
                Order : 1001
                Active : true
                Valid : true
                Process Error Response : true
                Pipeline : 
                    Generator : 
                        htmlparser : {includeTags=[Ljava.lang.String;@1f9494e7}
                    Transformers : 
                        linkchecker
                        contentsync : {component-optional=true}
                        hybridapp : {component-optional=true}
                        mobileappscampaign : {component-optional=true}
                    Serializer : 
                        htmlwriter
                Resource path: /libs/mobileapps/config/rewriter/hybrid-app
                
                
                
                
                Configuration campaign-link-rewrite
                
                
                Name : campaign-link-rewrite
                Content Types : [text/html]
                Resource Types : [mcm/campaign/components/newsletter, mcm/neolane/components/newsletter, mcm/campaign/components/campaign_newsletterpage]
                Order : 1000
                Active : true
                Valid : true
                Process Error Response : true
                Pipeline : 
                    Generator : 
                        htmlparser : {includeTags=[Ljava.lang.String;@3479b3d5}
                    Transformers : 
                        campaign-link-rewrite : {component-optional=true}
                    Serializer : 
                        htmlwriter
                Resource path: /libs/mcm/config/rewriter/campaign-link-rewrite
                
                
                
                
                Configuration cfm
                
                
                Name : cfm
                Content Types : [text/html]
                Resource Types : [dam/cfm/components/contentfragment]
                Selectors : [rawcontent]
                Order : 1000
                Active : true
                Valid : true
                Process Error Response : true
                Pipeline : 
                    Generator : 
                        html-generator
                    Transformers : 
                        cfm-payload
                        cfm-parfilter : {component-optional=false}
                        cfm-assetprocessor : {component-optional=false}
                    Serializer : 
                        htmlwriter
                Resource path: /libs/dam/config/rewriter/cfm
                
                
                
                
                Configuration pdf
                
                
                Name : pdf
                Extensions : [pdf]
                Order : 0
                Active : true
                Valid : true
                Process Error Response : false
                Pipeline : 
                    Generator : 
                        empty-generator
                    Transformers : 
                        htmlparser
                        xslt : {source=sling://libs/wcm/core/content/pdf/page2fo.xsl}
                    Serializer : 
                        fop : {mime-type=application/pdf}
                Resource path: /libs/cq/config/rewriter/pdf
                
                
                
                
                Configuration default
                
                
                Name : default
                Content Types : [text/html]
                Order : -1
                Active : true
                Valid : true
                Process Error Response : true
                Pipeline : 
                    Generator : 
                        htmlparser
                    Transformers : 
                        linkchecker
                        mobile : {component-optional=true}
                        mobiledebug : {component-optional=true}
                        contentsync : {component-optional=true}
                    Serializer : 
                        htmlwriter
                Resource path: /libs/cq/config/rewriter/default
                

                 

                So, if I remove the transformers from default, I still get an invalid response:

                {"Title":"Sample Product","Summary":"<p>This is a summary but it also has a <a title="\" href="Google\\" target="https://www\\">link<\/a>.<\/p>\r\n"}
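A rough model of what the listing above suggests: configurations are tried from the highest order to the lowest, and the first one whose content types and paths match is used. The sketch below is a simplification, not the Sling implementation. The `Config` class, the prefix-based path matching, and the rule that a missing content type counts as `text/html` are assumptions made for illustration. Only two of the listed configurations are included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Simplified model of rewriter configuration selection; not the Sling implementation.
final class RewriterSelectionSketch {

    // Hypothetical, reduced view of one rewriter configuration.
    static final class Config {
        final String name;
        final int order;
        final List<String> contentTypes; // empty list matches any content type
        final List<String> paths;        // empty list matches any path

        Config(String name, int order, List<String> contentTypes, List<String> paths) {
            this.name = name;
            this.order = order;
            this.contentTypes = contentTypes;
            this.paths = paths;
        }

        boolean matches(String contentType, String path) {
            // Assumption: a response without a recorded content type is treated as HTML.
            String effectiveType = (contentType == null) ? "text/html" : contentType;
            boolean typeOk = contentTypes.isEmpty() || contentTypes.contains(effectiveType);
            // Assumption: paths are matched by prefix.
            boolean pathOk = paths.isEmpty() || paths.stream().anyMatch(path::startsWith);
            return typeOk && pathOk;
        }
    }

    // Returns the name of the first matching configuration, highest order first, or null.
    static String select(List<Config> configs, String contentType, String path) {
        List<Config> sorted = new ArrayList<>(configs);
        sorted.sort(Comparator.comparingInt((Config c) -> c.order).reversed());
        for (Config c : sorted) {
            if (c.matches(contentType, path)) {
                return c.name;
            }
        }
        return null;
    }

    // Two entries taken from the status listing in this post.
    static List<Config> ootbSubset() {
        return Arrays.asList(
            new Config("hybrid-app", 1001, Arrays.asList("text/html"),
                Arrays.asList("/content/phonegap", "/content/mobileapps", "/content/campaigns")),
            new Config("default", -1, Arrays.asList("text/html"), Collections.<String>emptyList()));
    }

    public static void main(String[] args) {
        List<Config> configs = ootbSubset();
        System.out.println(select(configs, null, "/bin/commerce"));               // prints default
        System.out.println(select(configs, "application/json", "/bin/commerce")); // prints null
    }
}
```

Under this model, removing the transformers from `default` does not stop the pipeline from running: the HTML parser and serializer still process the JSON text, which would be consistent with the second invalid response shown above.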
                • 5. Re: Unable to Prevent LinkChecking on Servlet Request
                  luke.grover Level 1

                  Thanks to Jörg Hoh's comment, I added a new Sling rewriter configuration for this case. It seems the default rewriter configuration was being used, and it tries to rewrite the HTML tags in the JSON response.

                   

                  So I ended up with this:

                  <?xml version="1.0" encoding="UTF-8"?>
                  <jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
                            jcr:primaryType="nt:unstructured"
                            contentTypes="[application/json,text/html]"
                            enabled="{Boolean}true"
                            generatorType="htmlparser"
                            order="{Long}1"
                            selectors="[getCommerceDetails]"
                            serializerType="htmlwriter"
                            paths="[/bin/commerce]"
                            transformerTypes="[]">
                      <generator-htmlparser
                        jcr:primaryType="nt:unstructured"
                        includeTags="[NOT_A_REAL_TAG]"/>
                  </jcr:root>
                  

                   

                  So, a few takeaways from this...

                  1. If you use the htmlparser like this, you must specify a value for includeTags; otherwise the configuration does nothing.
                  2. I specified application/json in contentTypes, but the content type always seems to be text/html.
                  3. I didn't need both selectors and paths, but I wanted to be really specific so that I could see working vs. failing behavior by changing the path slightly. The servlet is bound by selector, so matching on the selector alone would probably be the best fit.
                  4. Not having transformers breaks the UI of /system/console/status-slingrewriter; it doesn't seem to know how to handle an empty list. So where else does this cause problems that I haven't seen yet? I've thought about adding a non-transforming transformer just to satisfy the need for a transformer, so I can assure myself and others that I'm not breaking anything outside of my path/selector combination.
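The "non-transforming transformer" idea from point 4 could look roughly like the sketch below. It uses only the JDK's SAX API so that it runs on its own. In Sling, such a class would implement `org.apache.sling.rewriter.Transformer` (which extends `ContentHandler`) and be created by a `TransformerFactory` OSGi service registered with a `pipeline.type` property; that wiring is omitted here and should be checked against the Sling Rewriter documentation. The names `PassThroughHandler` and `roundTrip` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class PassThroughSketch {

    // Forwards every SAX event unchanged to the next handler in the chain.
    static final class PassThroughHandler implements ContentHandler {
        private final ContentHandler next;

        PassThroughHandler(ContentHandler next) {
            this.next = next;
        }

        @Override public void setDocumentLocator(Locator locator) { next.setDocumentLocator(locator); }
        @Override public void startDocument() throws SAXException { next.startDocument(); }
        @Override public void endDocument() throws SAXException { next.endDocument(); }
        @Override public void startPrefixMapping(String prefix, String uri) throws SAXException { next.startPrefixMapping(prefix, uri); }
        @Override public void endPrefixMapping(String prefix) throws SAXException { next.endPrefixMapping(prefix); }
        @Override public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException { next.startElement(uri, local, qName, atts); }
        @Override public void endElement(String uri, String local, String qName) throws SAXException { next.endElement(uri, local, qName); }
        @Override public void characters(char[] ch, int start, int length) throws SAXException { next.characters(ch, start, length); }
        @Override public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException { next.ignorableWhitespace(ch, start, length); }
        @Override public void processingInstruction(String target, String data) throws SAXException { next.processingInstruction(target, data); }
        @Override public void skippedEntity(String name) throws SAXException { next.skippedEntity(name); }
    }

    // Parses the XML, sends the events through the pass-through, and rebuilds the markup.
    // Attributes are ignored by the recorder to keep the sketch short.
    static String roundTrip(String xml) {
        final StringBuilder out = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override public void startElement(String uri, String local, String qName, Attributes atts) {
                out.append('<').append(qName).append('>');
            }
            @Override public void endElement(String uri, String local, String qName) {
                out.append("</").append(qName).append('>');
            }
            @Override public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new PassThroughHandler(recorder));
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<p>a <b>link</b>.</p>")); // prints <p>a <b>link</b>.</p>
    }
}
```

A pass-through like this only satisfies the need for a transformer entry. The HTML parser and serializer around it would still process the JSON text, so it does not replace the includeTags workaround in the configuration above.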
                  • 6. Re: Unable to Prevent LinkChecking on Servlet Request
                    Jörg Hoh Adobe Employee

                    Hi Luke,

                     

                    thanks for letting us know how you resolved your issue. Regarding finding 4 (not having transformers breaks the console), I would ask you to raise a ticket in the Sling Jira. That should not happen :-)

                     

                    Regards,

                    Jörg

                    • 7. Re: Unable to Prevent LinkChecking on Servlet Request
                      Jörg Hoh Adobe Employee

                      Hi Luke,

                       

                      thanks for reporting it and even providing a patch!

                       

                      Jörg