1 Reply Latest reply on Oct 25, 2011 2:31 AM by G. Walt

    CRX Max Size


      Hello folks,

      We are building our strategy for the coming years with WEM, and we are trying to find out the maximum CRX size that CQ can support. I ran across the following post http://dev.day.com/content/kb/home/Crx/CrxFAQ/CrxLimitation.html which basically says there is no limit. Even if that is true, at a certain point other operations (such as backups or restores) will start to fail or slow down.


      Please share what you know about CRX limits. Have you run into slowness of the overall CQ solution due to CRX size (even if the slowness is due to the overall framework and not the CQ product)?


      Have you considered a virtual repository? http://dev.day.com/docs/en/crx/current/administering/virtual_repository.html



        • 1. Re: CRX Max Size
          G. Walt Adobe Employee

          Hi Lior,


          That is correct: there is no particular limit on the repository size. By default, files larger than 4096 bytes are not saved in the repository storage (typically the TAR files), but as plain files on the file system by the Data Store (see the Data Store documentation). The Persistence Manager then stores only a file hash in the repository storage, so if a file appears at multiple locations in the repository, it is saved only once on the file system. So really, the limiting factor is mainly the disk drive.
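To make the hash-based deduplication concrete, here is a toy Python sketch of a content-addressed store. The class, the `inline:`/`datastore:` identifiers, and the hex encoding are my own simplifications for illustration, not the actual CRX implementation; only the 4096-byte default threshold is taken from the description above.

```python
import hashlib
import os
import tempfile

class DataStore:
    """Toy content-addressed store: large binaries are saved once on disk,
    keyed by their content hash. Illustrative only, not the CRX code."""

    MIN_RECORD_LENGTH = 4096  # mirrors the default CRX threshold

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def add(self, data: bytes) -> str:
        """Return the identifier the persistence layer would store."""
        if len(data) < self.MIN_RECORD_LENGTH:
            # small binaries stay inline in the repository storage
            return "inline:" + data.hex()
        digest = hashlib.sha1(data).hexdigest()
        path = os.path.join(self.root, digest)
        if not os.path.exists(path):  # identical content is written only once
            with open(path, "wb") as f:
                f.write(data)
        return "datastore:" + digest

# demo: the same binary referenced from two repository locations
store = DataStore(tempfile.mkdtemp())
blob = b"x" * 5000
id1 = store.add(blob)
id2 = store.add(blob)                     # second location, same content
assert id1 == id2                         # both nodes hold the same hash
assert len(os.listdir(store.root)) == 1   # only one copy on disk
```

This is why packaging or copying a large asset tree inside the repository costs far less binary disk space than the node count suggests: every duplicate binary collapses to one record.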


          Here are a few additional things that might be useful when working with large data sets:

          • The Persistence Manager works incrementally, which offers optimal performance, but after many content modifications some obsolete content will accumulate. If this takes up too much disk space, you should consider TAR optimization and Data Store garbage collection.
          • Building a package of some content duplicates that content into the package, so it roughly doubles the amount of disk space needed (and package files also end up in the Data Store).
          • If you start uploading a lot of content to your instance, optimize your workflows to do only what is needed (basically, each workflow step that renders an asset has to load the asset into memory). To reduce the amount of data in your repository, you could also remove the original asset once its renditions have been generated.
          • You should avoid content structures with more than a thousand nodes on the same level of the hierarchy (as you already know).
          • If you have a lot of content that doesn't need to be searchable, configure your search indexing accordingly.
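On the garbage-collection point above, a data-store collection is conceptually a mark-and-sweep: mark every data-store identifier still referenced from the repository, then sweep away the records nobody points at. The Python sketch below is a toy illustration of that idea; the flat directory layout and the `collect_garbage` function are made up for the example and are not the actual CRX mechanism.

```python
import os
import tempfile

def collect_garbage(datastore_dir, referenced_ids):
    """Sweep phase of a toy mark-and-sweep: remove every data-store record
    whose identifier is no longer referenced from the repository."""
    removed = 0
    for name in os.listdir(datastore_dir):
        if name not in referenced_ids:  # no node references this record
            os.remove(os.path.join(datastore_dir, name))
            removed += 1
    return removed

# demo: three records on disk, only one still referenced ("marked")
d = tempfile.mkdtemp()
for record in ("aaa", "bbb", "ccc"):
    with open(os.path.join(d, record), "wb") as f:
        f.write(b"binary payload")
removed = collect_garbage(d, referenced_ids={"aaa"})
assert removed == 2
assert os.listdir(d) == ["aaa"]
```

The mark phase is the expensive part on a real instance (it has to traverse the repository), which is why garbage collection is something you schedule rather than something that runs continuously.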

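On the thousand-children point, a common workaround is to spread flat content under hashed intermediate "bucket" folders so no single node accumulates an unbounded child list. A minimal Python sketch; the `/content/assets` layout and the `bucket_path` helper are hypothetical, not a CQ API:

```python
import hashlib
from collections import Counter

def bucket_path(name, fanout=1000):
    """Map an asset name to a hashed bucket folder so that no single node
    ends up with an unbounded number of children. Hypothetical helper."""
    bucket = int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16) % fanout
    return "/content/assets/%03d/%s" % (bucket, name)

# demo: 10,000 assets spread over at most 1,000 buckets
paths = [bucket_path("asset-%d.jpg" % i) for i in range(10_000)]
counts = Counter(p.rsplit("/", 2)[1] for p in paths)
assert max(counts.values()) < 1000  # each bucket stays well under the limit
```

Date-based folders (`/2011/10/25/...`) achieve the same thing when content arrives over time; the hash variant works when names arrive in no particular order.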

          Finally, I'd like to say that when working with very large data sets, it is always a good idea to set up a test environment with realistic content and to test the different cases, such as feeding new data into the system, authoring, web access, replication, backup, etc. If you reach some limit, we will work with you to identify and optimize the bottlenecks.