3 Replies Latest reply on Jun 21, 2013 10:24 PM by sk150484

    Apache Tika PDF Parser not working in CQ 5.6

    sk150484

      Hi all,

       

      I was using the following code to extract text from a PDF in a CQ package.

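      import org.apache.tika.metadata.Metadata;
      import org.apache.tika.parser.ParseContext;
      import org.apache.tika.parser.Parser;
      import org.apache.tika.parser.pdf.PDFParser;
      import org.apache.tika.sax.BodyContentHandler;
      import org.xml.sax.ContentHandler;

      // Collects the text content of the parsed document
      ContentHandler handler = new BodyContentHandler();
      Parser parser = new PDFParser();
      // a.getOriginal().getStream() supplies the PDF as an InputStream
      parser.parse(a.getOriginal().getStream(), handler, new Metadata(), new ParseContext());
      String text = handler.toString();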

       

      This works perfectly in CQ 5.5, but in CQ 5.6 I get the following exception:

      Caused by: java.lang.ClassNotFoundException: org.apache.pdfbox.io.RandomAccess not found by org.apache.tika.parsers [58]

                at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1499)

                at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:75)

                at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1882)

                at java.lang.ClassLoader.loadClass(Unknown Source)
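
      To narrow this down, a check along these lines could show whether the class is visible to a particular class loader. This is only a rough sketch: ClassVisibility and isVisible are hypothetical names, and inside the CQ bundle one would presumably pass PDFParser.class.getClassLoader() rather than the loader used in main here.

```java
public class ClassVisibility {

    // Returns true if the named class can be resolved through the given loader.
    public static boolean isVisible(String className, ClassLoader loader) {
        try {
            // initialize = false: resolve the class without running its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: outside OSGi this is just the application class loader.
        ClassLoader loader = ClassVisibility.class.getClassLoader();
        System.out.println("java.lang.String visible: "
                + isVisible("java.lang.String", loader));
        System.out.println("org.apache.pdfbox.io.RandomAccess visible: "
                + isVisible("org.apache.pdfbox.io.RandomAccess", loader));
    }
}
```

      If the class is not visible from the parser's loader, that would be consistent with the stack trace above, where the lookup fails inside the org.apache.tika.parsers bundle.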

       

      Any ideas why this class is not being found?

       

      Thanks!