4 Replies Latest reply on Oct 4, 2012 10:22 AM by justin_at_adobe

    25,000 node dataset?  Bad idea?

    biosopher

      I'm still hesitant to upload a large dataset into the JCR, so I thought I'd ask for others' experience/input.

       

      1. What size datasets do you have?

      2. How does CRXDE handle such a large dataset?  It seems the file explorer UI would crash if I opened a node with that many children.

      3. How is query performance?  MySQL is handling this dataset with ease.

        • 1. Re: 25,000 node dataset?  Bad idea?
          biosopher Level 1

          I should clarify that this is a top-level node containing 25,000 children.  From what I've seen on the web, this is a bad idea as the implication seems to be that only 10-15 nodes should be present at each level.

           

          As an example, think about a book of recipes.  In a relational DB, you could easily have a table of 25,000 recipes.  It seems this doesn't work for a JCR, as having a 'recipes' node with 25,000 'recipe' children would overwhelm the JCR.  If that's so, it seems I need to look into linking CRX to a relational DB unless someone can offer suggestions on dealing with the 25,000 child recipes.

          • 2. Re: 25,000 node dataset?  Bad idea?
            justin_at_adobe Adobe Employee

            biosopher wrote:

             

            From what I've seen on the web, this is a bad idea as the implication seems to be that only 10-15 nodes should be present at each level.

            I don't think this is accurate. The most common number you will see is to not have more than 1000 nodes per hierarchy level.

             

            There are a variety of factors that lead into this. In general, you will run into UI limitations well before raw repository performance becomes the primary factor.

             

            For your use case, I would think there are a handful of ways to divide these - first letter, primary cuisine, date, etc.
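A rough sketch of the first-letter idea, in case it helps to see it concretely. It only computes target paths and checks how full each parent would get; it does not talk to a repository. The root path `/content/recipes`, the function names, and the `_` padding for short names are all assumptions made up for this illustration, and the 1000 figure is the rule of thumb quoted above, not a hard limit.

```python
from collections import Counter

MAX_CHILDREN = 1000  # rule-of-thumb figure from this thread, not a hard JCR limit


def bucketed_path(root: str, name: str, depth: int = 1) -> str:
    """Return a path that nests `name` under `depth` levels of prefix buckets.

    Hypothetical helper: depth=1 buckets by first letter, depth=2 adds a
    second level keyed on the first two characters, and so on.
    """
    key = name.lower()
    # Pad short names so every node still gets `depth` bucket levels.
    padded = key.ljust(depth, "_")
    buckets = [padded[: i + 1] for i in range(depth)]
    return "/".join([root.rstrip("/"), *buckets, name])


def oversized_buckets(paths, limit=MAX_CHILDREN):
    """Count leaf nodes per direct parent and return parents over `limit`."""
    counts = Counter(p.rsplit("/", 1)[0] for p in paths)
    return {parent: n for parent, n in counts.items() if n > limit}


if __name__ == "__main__":
    names = ["apple-pie", "apricot-jam", "borscht"]
    for n in names:
        print(bucketed_path("/content/recipes", n))
    print(oversized_buckets(bucketed_path("/content/recipes", n) for n in names))
```

Spread evenly, 25,000 names over 26 first-letter buckets would average roughly 960 per bucket, which is already close to the 1000 figure, and real names are rarely evenly distributed. Running the actual names through `oversized_buckets` before importing would show whether a second level (`depth=2`) or a different key such as cuisine or date is needed.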

             

            Query performance should not be impacted by node structure.

             

            Regards,

            Justin

            • 3. Re: 25,000 node dataset?  Bad idea?
              biosopher Level 1

              Thanks Justin.  My challenge is to convince die-hard relational guys that JCR is a viable alternative.  I'd have to agree with them that limiting to 1,000 nodes at a hierarchy level by breaking up the dataset is a hack.  That said, the power of having the dataset integrated into CQ5 is a counter-weight to that hack.

              • 4. Re: 25,000 node dataset?  Bad idea?
                justin_at_adobe Adobe Employee

                To be clear, this really has very little to do with JCR and more to do with the tools.