6 Replies Latest reply on Jul 5, 2007 4:17 PM by Rothrock

    300,000+ XML Nodes

      Hi all

      // begin sarcasm
      So I have this little XML file, with 300,000+ nodes (318,006 lines in this XML code to be exact).
      // end sarcasm

      Right now, I'm loading this monster in. When I finish loading, I want to take all these nodes and save them to an Object, so I can easily reference them later.

      The way I'm doing this now is with a for-loop, which in the end gathers about 5000 nodes, each with anywhere between 10 and 30 children.

      As best as I can tell, I have gotten this working successfully to this point. The problem I'm having now is that with this much data, Flash chokes on it a bit. So after my load sequence, there's about 5 or 6 seconds where nothing happens, then everything suddenly works.

      Now granted, this test XML I'm using is a worst-case scenario. And I think 5 or 6 seconds isn't bad. But I'm on a fairly fast and new machine and I'm worried about a lesser machine's performance.

      So is there a better way to do this? Should I not use a for-loop? Should I not save this much data to an Object?

      Any suggestions are welcome.

        • 1. Re: 300,000+ XML Nodes
          Greg Dove Level 4
          Shouldn't that be:
          <sarcasm>So I have this little XML file, with 300,000+ nodes (318,006 lines in this XML code to be exact).</sarcasm>

          It's certainly a hefty load. I've never tried working with data that large before.
          I'd run through a few things for testing...
          Isolate timings with getTimer to understand how long the whole process is taking.
          load time: difference between load command issued and onData
          parsing time: difference between onData (see the livedocs) and onLoad (which you need to call explicitly if you define your own onData)
          processing time: your looping and extraction of portions. You're right, this is likely to be the part that takes a long time if you're looping through, and possibly recursing down, the XML structure to pick out data.
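
          A minimal sketch of the timing idea above, written in TypeScript rather than ActionScript 2 so it can run outside Flash. PhaseTimer and the mark labels are made-up names, and the injected clock function stands in for getTimer. In a real movie the marks would be recorded when load is called, inside onData, inside onLoad, and after the processing loop; here a fake clock supplies illustrative numbers only.

```typescript
// Hypothetical helper (not a Flash API): records a timestamp per named phase
// boundary and reports the elapsed milliseconds between any two of them.
class PhaseTimer {
  private marks: { [label: string]: number } = {};
  private now: () => number;

  // `now` is an assumption standing in for getTimer(); defaults to wall-clock ms.
  constructor(now?: () => number) {
    this.now = now ? now : function () { return new Date().getTime(); };
  }

  // Record the current time under a label.
  mark(label: string): void {
    this.marks[label] = this.now();
  }

  // Elapsed milliseconds between two previously recorded marks.
  between(from: string, to: string): number {
    if (!(from in this.marks) || !(to in this.marks)) {
      throw new Error("unknown mark: " + from + " or " + to);
    }
    return this.marks[to] - this.marks[from];
  }
}

// Usage with a fake clock; the values are invented for illustration.
let fakeNow = 0;
const phaseTimer = new PhaseTimer(() => fakeNow);
phaseTimer.mark("loadIssued");
fakeNow = 120;  phaseTimer.mark("onData");    // raw text has arrived
fakeNow = 900;  phaseTimer.mark("onLoad");    // parsing finished
fakeNow = 6400; phaseTimer.mark("processed"); // extraction loop finished

const loadMs = phaseTimer.between("loadIssued", "onData");   // 120
const parseMs = phaseTimer.between("onData", "onLoad");      // 780
const processMs = phaseTimer.between("onLoad", "processed"); // 5500
```

          Comparing the three numbers shows which phase is worth attacking first: if parsing dominates, breaking up the processing loop will not help much.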

          Questions I would ask myself:
          -would it make sense (if possible) to have the source xml provided in bite-size chunks?
          -does XPathAPI or xfactorstudios xpath offer any improvement for the search and extraction part of what you're doing? This way you may be able to separate the node location logic from the new object representation. In saying that, it's also possible (and much easier with an XPath implementation) to make use of the data directly in its XML object representation.

          Nothing concrete there I'm afraid... but you said that all suggestions were welcome...and I really wanted to put the bit about the sarcasm tags at the beginning.
          • 2. 300,000+ XML Nodes
            ChrisFlynn Level 1
            I've seen sarcasm tags both ways.

            I'm actually working on an attempt with onEnterFrame instead of for-loops. My hope is that with onEnterFrame, I'd be able to incorporate some sort of progress bar that will at least let the user know something is going on. The for-loop won't really let you do that.

            One of my thoughts was to try and get this XML data into smaller bits. But since I'm not actually developing the XML end of things, that's nothing I have the final say-so in. Plus, the potential of chasing around 5000 separate XML files is nothing we want to deal with.

            The XML we're using is dynamically created from a database, and apparently is fairly complex to get it the way we have it now. That might have to wait till version 2.0.

            Thanks again GWD
            • 3. Re: 300,000+ XML Nodes
              Rothrock Level 5
              I would definitely recommend trying it on an older, slower computer. You can certainly break up the part where you are searching through, like you have suggested, but the initial parse by Flash is not something you can break up, and I've seen folks post here about getting the "slow script" error during that part.

              If the file is dynamically created from a database it should be fairly easy to break it into chunks which could be loaded, parsed, and then appended together in Flash. It would break up some of that initial parse.
              • 4. Re: 300,000+ XML Nodes
                Greg Dove Level 4
                I've never tried anything like this... but if the internal parsing by Flash is a major factor and you are not able to get the source split into smaller chunks...

                Perhaps it's possible to break up the parsing of the raw XML string from inside your onData handler. For example, you could use String.split on the closing tag of a child of the root node. This would presumably work best if those closing tags are well spaced and infrequent, and the tag does not occur anywhere lower in the hierarchy. Before parsing each piece, you would also need to tweak it: add back the closing tag used for the split, plus the opening and/or closing root tags, depending on the piece's index in the array of strings.
                I don't know if it's practical or makes sense, but it's something else to explore as an idea. It's a definite second to breaking it down at the source, but it may be another option.
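
                A rough sketch of that split-and-rewrap idea, in TypeScript rather than ActionScript 2 so it can be run on its own. The function name and the tag names data and rec are invented for the example. It assumes the chosen child tag appears only directly under the root and that the root opening tag has no attributes; anything before the first child (declaration, root tag) simply stays in the first piece.

```typescript
// Hypothetical helper: cuts one large XML string into several small,
// individually well-formed documents, one per top-level child element.
function splitIntoDocuments(raw: string, rootName: string, childName: string): string[] {
  const closeChild = "</" + childName + ">";
  const openRoot = "<" + rootName + ">";   // assumption: root tag has no attributes
  const closeRoot = "</" + rootName + ">";

  const pieces = raw.split(closeChild);
  if (pieces.length === 1) {
    return [raw]; // child tag never occurs; nothing to split
  }

  const docs: string[] = [];
  // The final piece is whatever follows the last closing child tag
  // (normally just the closing root tag), so it is skipped.
  for (let i = 0; i < pieces.length - 1; i++) {
    const body = pieces[i] + closeChild; // add back the tag consumed by split
    if (i === 0) {
      docs.push(body + closeRoot);            // already starts with the root tag
    } else {
      docs.push(openRoot + body + closeRoot); // needs both root tags
    }
  }
  return docs;
}

const sampleXml = "<data><rec><a>1</a></rec><rec><a>2</a></rec></data>";
const sampleDocs = splitIntoDocuments(sampleXml, "data", "rec");
// sampleDocs[0] is "<data><rec><a>1</a></rec></data>"
// sampleDocs[1] is "<data><rec><a>2</a></rec></data>"
```

                Each resulting string could then be parsed separately, one per frame or interval tick, so no single parse call has to handle the whole file. With thousands of children it would probably make sense to group several per piece, which this sketch does not do.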
                • 5. Re: 300,000+ XML Nodes
                  ChrisFlynn Level 1
                  You know, the onEnterFrame method seems to have worked for me, but it's much, much slower. I think it's slower because I'm actually displaying what node is being processed, as it's being processed. So it looks like:

                  0 of 5000 processed
                  1 of 5000 processed
                  2 of 5000 processed


                  5000 of 5000 processed.

                  Of course with that many, it took a full 2 minutes, but it didn't freeze up.

                  My hope is that with using setInterval, and not displaying each one as they're processed, this will go much faster and it won't lock up.

                  Thanks again
                  • 6. Re: 300,000+ XML Nodes
                    Rothrock Level 5
                    I wouldn't necessarily do one node on each frame. You will probably be wasting time then. Certainly there is some chunk size that can be handled quickly in a frame or a setInterval. I would probably try and process them in chunks of 1000 or 5000 on each frame and see how that went.
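
                    Something along these lines, sketched in TypeScript rather than ActionScript 2 so it runs outside Flash. ChunkedJob is a made-up name, and the loop at the bottom only stands in for onEnterFrame or setInterval calling step once per tick. The chunk size of 1000 is just the starting guess from above, not a measured value.

```typescript
// Hypothetical chunked worker: handles `chunkSize` items per call to step(),
// so the player gets a chance to redraw between calls.
class ChunkedJob<T> {
  private index = 0;
  private items: T[];
  private chunkSize: number;
  private handle: (item: T, i: number) => void;

  constructor(items: T[], chunkSize: number, handle: (item: T, i: number) => void) {
    this.items = items;
    this.chunkSize = chunkSize;
    this.handle = handle;
  }

  // Call once per frame or interval tick.
  // Returns true while there is still work left.
  step(): boolean {
    const end = Math.min(this.index + this.chunkSize, this.items.length);
    for (; this.index < end; this.index++) {
      this.handle(this.items[this.index], this.index);
    }
    return this.index < this.items.length;
  }

  // Fraction complete, from 0 to 1, suitable for driving a progress bar.
  progress(): number {
    return this.items.length === 0 ? 1 : this.index / this.items.length;
  }
}

// Usage: 5000 placeholder "nodes", processed 1000 per tick.
const sampleNodes: number[] = [];
for (let n = 0; n < 5000; n++) {
  sampleNodes.push(n);
}
const handled: number[] = [];
const sampleJob = new ChunkedJob<number>(sampleNodes, 1000, (item) => {
  handled.push(item);
});

let ticks = 0;
let moreWork = true;
while (moreWork) {        // stands in for successive frames
  moreWork = sampleJob.step();
  ticks++;                // the progress display would be updated here, once per tick
}
// ticks is 5, handled.length is 5000
```

                    Updating the display once per chunk instead of once per node keeps the feedback without paying for 5000 text updates.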