6 Replies Latest reply on Sep 3, 2010 9:05 AM by mcdnlt

    Files with many objects -- CS4 slows to the point of being unusable

    jf_moyen

      Good evening,

       

      I'm currently trying to work with a file that is not terribly huge, but does contain many objects; specifically, it's a set of some 30 scientific diagrams, each of them containing 3500 data points. The diagrams were ultimately generated by R (a statistical package; I do not have much control over the kind of output it generates) as a PDF, which I eventually managed to import into CS4. Each data point can be a letter ("o"), a simple object (such as a triangle), or a more complex object (a cross "+", made of two segments).

       

      The PDF file is about 1.5 MB on disk. The same file converted to EPS is 13 MB, and when saved as an .ai file in CS4, it balloons to 104 MB. Well, space is not much of an issue here; I have a reasonable disk and 4 GB of memory anyway.

       

      However, I need to edit the file to tweak it a bit, by changing the labels, adding titles, etc. Opening the file is a mission in itself (it takes 3-4 minutes); after which, doing anything at all is close to impossible. The hand tool is all but unusable; selecting an object takes a few seconds and causes a noticeable pause. Selecting all objects with the same stroke/fill is a joke; dragging one part of the image... Well, let's not even talk about it. In a nutshell -- CS4, in this situation, is slow to the point of being unusable.

       

      I did try to work in "Outline" mode (as opposed to "Preview"; sorry, I have a French version and am not sure about the translation, Ctrl+Y anyway); I also removed the thumbnails in the "Layers" window; little improvement, if at all.

       

      I would appreciate any hints to speed up the process a bit -- of course, I could cut my file into smaller chunks and process each individually, but I'm doing a few repetitive operations (select all points of one colour, change it to some other colour matching a colour code, etc.) and it would be much more convenient to do all of it once and for all.

       

      Thanks !

        • 1. Re: Files with many objects -- CS4 slows to the point of being unusable
          Mylenium Most Valuable Participant

          You may have some luck with James' scripts to unify some of the paths and clean up elements (http://www.illustrationetc.com/AI_Javascripts/PathScripts.htm). Scientific programs have a tendency to really spit out every element as a bunch of separate lines. Uniting some of them would greatly improve matters. Once you have successfully selected data points, you may also wish to replace them with a symbol to facilitate any subsequent editing of their shape and appearance.

           

          Mylenium

          • 2. Re: Files with many objects -- CS4 slows to the point of being unusable
            JETalmage-71mYin Level 3

            JF,

            Before you embark on a wild-goose-chase: I don't have any scripts immediately applicable to what I read in your description.

             

            More information is needed:

             

            Each data point can be a letter ("o"), a simple object (such as a triangle), or a more complex object (a cross "+", made of two segments).

             

            But what kind of actual Illustrator-native objects are they? Are the "letters" actual text objects or compound paths, or a pair of simple paths, or a raster image? Are the "triangles" single closed paths or three open paths? Are the crosses two separate, two-point paths that actually cross each other, or are they four open paths? Are any of these "objects" actually imported as Groups? Are the three different "data points" different colors? Specifically what kind of AI-native objects they are, and their AI-native attributes, will largely determine the possible ways to select them or to target them with a script.

             

            Your description doesn't even say you want to edit or replace the data point graphics (though it sounds like it might be advantageous to do so). It says "I need to edit the file to tweak it a bit, by changing the labels, adding titles, etc." What labels? How many labels? Are you talking about a label at each "data point," or some kind of legend?

             

            Is this a regularly-recurring project, or do the 30 diagrams constitute a one-shot, one-time project?

             

            Post a link to a screenshot of the original and of what you want to end up with. There may be other workarounds. Given that the nature of your core issue is performance in opening--not only editing--the file, posting a sample of an actual AI file would be helpful to anyone actually trying to help you concoct a solution.

             

            Both the advantage and disadvantage of scripting is: Scripts are usually written for very specific situations, and are usually not drop-in solutions for even similar-sounding, but slightly different situations. Usually, one-shot uses do not warrant the time/effort of working out a script. Sometimes an existing script can be easily modified for a slightly-different set of specifics. (But one has to know those specifics.)

             

            An open related aside to all: Just one of the many advantages of FreeHand's Graphic Find & Replace feature, by the way, is the ability to find and select paths that are identical throughout a document. Any FH user worth his salt will agree that GF&R would be one of the most valuable features of that historically superior program to be emulated in Illustrator--far more valuable (in my opinion) than most of the new features of CS5 (with the arguable exception of Perspective Grids) put together.

             

            JET

            • 3. Re: Files with many objects -- CS4 slows to the point of being unusable
              jf_moyen Level 1

              Thanks for the long reply; before giving you further details on what I'm trying to do, let me state that this time I managed to manually "clean" my file; it took me a good few hours anyway, and as I'm processing this type of figure quite often, I'd be happy to streamline my workflow, at least for next time.

               

              I'm plotting data from large databases (routinely 1000 - 5000 data points), and examining the data using a variety of plots (not all of them are binary). Typically, I'll draft some 20-30 plots, that I of course want to be homogeneous. The plots are made using an extension (called GCDkit) to a (free) stat software called "R" (http://www.r-project.org/). R is able to export its plots to pdf -- although the pdf is not terribly well-behaved. I then "merge" all the pdf plots by "printing" them all on one page using "Adobe pdf" printer in Acrobat. Finally, all the plots are opened into illustrator, for a few final steps:

               

              - Change font/size etc. for axis labels and scales;
              - Move each group of symbols (the red circles, the blue squares...) to its own layer, to allow easy generation of different plot with/without certain type of data;
              - Change some of the symbols (for instance make the red circles white with an orange stroke);
              - Correct graphical mistakes;
              - Annotate plots (add an arrow here, a comment there...).

               

              But what kind of actual Illustrator-native objects are they? Are the "letters" actual text objects or compound paths, or a pair of simple paths, or a raster image? Are the "triangles" single closed paths or three open paths? Are the crosses two separate, two-point paths that actually cross each other, or are they four open paths? Are any of these "objects" actually imported as Groups? Are the three different "data points" different colors? Specifically what kind of AI-native objects they are, and their AI-native attributes, will largely determine the possible ways to select them or to target them with a script.

               

              Most of them, depending on the specific symbols. For instance, the circles are a lower case "o" (a text object). A filled circle is actually *two* "o"'s, one with a fill and no stroke and one with a stroke and no fill (duh). The triangles are one path -- an open path, too; it does not even close (i.e., it is made of two segments). The squares, as far as I remember, are a closed path (4 segments). The crosses (and a few more complex symbols) are a couple of intersecting paths (one-segment paths), and I'm not even sure they are grouped.

               

              Grouping is haphazard, too. Some symbols of one class may be part of a group -- but then some other symbols of the very same class are part of a different group.

               

               



              Your description doesn't even say you want to edit or replace the data point graphics (though it sounds like it might be advantageous to do so). It says "I need to edit the file to tweak it a bit, by changing the labels, adding titles, etc." What labels? How many labels? Are you talking about a label at each "data point," or some kind of legend?

               

              Yeah, sorry I was not clear -- in fact, it can be each and every one of them. I partially replied to that above. Labels would be axis labels. I may need to alter the data point graphics (mostly by changing colors to match another color code); what I'm mostly trying to do is (a) to simplify my file, as R generates all sorts of useless objects (more on that below); (b) to move each type of point (= distinct symbol) to its own layer; and (c) to change the font/size of axis labels, add a few comments maybe, etc.

               

              Is this a regularly-recurring project, or do the 30 diagrams constitute a one-shot, one-time project?

              Fairly regularly recurring...



              Post a link to a screenshot of the original and of what you want to end up with. There may be other workarounds.

               

              Sure -- a couple of files are linked. Of course, I cannot locate the real files, original and processed, but I'm reasonably confident these two should give you a good idea. The pdf is what R gives me; the .ai is what I want to end up with.

               

              the illustrator file (final) : http://dl.free.fr/usMuKi8Pd

              the original pdf : http://dl.free.fr/cEH9GDWma

               

              Most differences are not visible. The key issue (and, I believe, the one at the core of my problem) is that R generates awful pdf's: haphazard grouping, symbols made of random objects, axis labels with crazy layout, fonts that do not work in AI (the red things in the pdf below should be "o"'s, but somehow they are declared in a font that AI cannot read). Also, there are lots and lots of invisible objects: clipping masks, bounding boxes, or just invisible squares. Presumably, this is what causes most of my troubles.

               

              Both the advantage and disadvantage of scripting is: Scripts are usually written for very specific situations, and are usually not drop-in solutions for even similar-sounding, but slightly different situations. Usually, one-shot uses do not warrant the time/effort of working out a script. Sometimes an existing script can be easily modified for a slightly-different set of specifics. (But one has to know those specifics.)

               

               

              Indeed, I'm not sure scripting is my solution (somebody else suggested it -- not me). What I'm really after are some pointers, maybe some ways to clean the file semi-automatically (in a few clicks) to bring it back to a more manageable state ?

               

              An open related aside to all: Just one of the many advantages of FreeHand's Graphic Find & Replace feature, by the way, is the ability to find and select paths that are identical throughout a document. Any FH user worth his salt will agree that GF&R would be one of the most valuable features of that historically superior program to be emulated in Illustrator--far more valuable (in my opinion) than most of the new features of CS5 (with the arguable exception of Perspective Grids) put together.

               

               

              Well, I dare say I could use such a feature myself, but alas that's beside the point (for now?)

               

              Thanks !

              • 4. Re: Files with many objects -- CS4 slows to the point of being unusable
                JETalmage-71mYin Level 3

                The problem sounds interesting. I don't speak the language of the other link, and couldn't find a button on the page to download the native AI file. But I downloaded the PDF and opened it in AI.

                 

                Here are my initial thoughts, for what it's worth:

                 

                The file has less than 2000 paths. 222K on disk. That's not a complex file, and I don't see any performance problems at all with it.

                The data point markers are all simple paths, not grouped.

                The file has the usual handful of awkward and unnecessary clipping masks, but those are easy to remove.

                The rest of the content (axis values, etc.) is just ordinary pointType objects, as one would expect.

                 


                Change font/size etc. for axis labels and scales

                That's easy: Marquee-select the whole row of labels across an axis, select a different size or other text attributes in the Characters Palette. Better (for recurring use) define Character Styles with all the desired formatting, and you can thereafter reformat the selected text objects with one click.


                Move each group of symbols (the red circles, the blue squares...) to its own layer

                 

                The markers are not grouped, but each marker type seems to be a different color, and the elements of each are all the same color. So you can easily isolate them to separate layers by:

                 

                1. Layer Palette: New Layer. Name it "GreenCrosses."

                2. Select one path of a green cross marker.

                3. Select menu > Same > Stroke Color.

                4. Group (not really necessary, but handy).

                5. Layer Palette: Drag the selection marker (the little square at the right end of the Layer 1 listing) to the same spot on the listing of the new Layer.

                6. Repeat for the other-colored markers.

                 

                You could semi-automate that process by recording an Action (macro).
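A minimal sketch of what steps 1-6 amount to, written in plain Python rather than Illustrator's own scripting interface. The path records, the `split_by_stroke` helper and the layer-naming scheme are all hypothetical stand-ins for illustration; it assumes, as in the sample file, that each marker type has its own stroke color.

```python
from collections import defaultdict

# Hypothetical stand-in for paths read from the artwork: (path_id, stroke_color).
# These records are illustrative only; this is not Illustrator's API.
paths = [
    ("p1", "green"),
    ("p2", "red"),
    ("p3", "green"),
    ("p4", "blue"),
    ("p5", "red"),
]

def split_by_stroke(paths):
    """Return {layer_name: [path_id, ...]}, one layer per stroke color."""
    layers = defaultdict(list)
    for path_id, stroke in paths:
        # Equivalent of "Select > Same > Stroke Color", then moving the
        # selection to a layer named after that color.
        layers[f"{stroke.capitalize()}Markers"].append(path_id)
    return dict(layers)

layers = split_by_stroke(paths)
for name, members in layers.items():
    print(name, members)
```

The point of the sketch is only that one pass over the paths, keyed on stroke color, replaces the repeated manual select-and-drag.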

                 


                Change some of the symbols (for instance make the red circles white with an orange stroke);

                Now that they're isolated on their own layers--and therefore easily selectable as a set--just select the Layer or Group and change its fill/stroke Appearance(s).

                Also familiarize yourself with the Object > Transform > Transform Each command. The markers are all too large; they clump together too much visually. Transform Each lets you apply the same scaling percentage to each selected element, without affecting its position. (Do this while they are ungrouped; a Group will be considered one object by Transform Each.)
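To make the difference concrete, here is a rough geometric sketch in plain Python (not Illustrator scripting). It assumes scaling about the center of each object's bounding box; the function names and the two square markers are made up for illustration.

```python
def bbox_center(points):
    """Center of the axis-aligned bounding box of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)

def scale_about(points, center, factor):
    """Scale points toward or away from a fixed center."""
    cx, cy = center
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in points]

def transform_each(objects, factor):
    """Scale every object about its OWN center, so positions are preserved."""
    return [scale_about(obj, bbox_center(obj), factor) for obj in objects]

def transform_group(objects, factor):
    """Scale all objects about their COMMON center, as happens for a Group."""
    common = bbox_center([p for obj in objects for p in obj])
    return [scale_about(obj, common, factor) for obj in objects]

# Two hypothetical square markers, far apart on the page.
markers = [
    [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    [(10.0, 10.0), (12.0, 10.0), (12.0, 12.0), (10.0, 12.0)],
]

each = transform_each(markers, 0.5)
grouped = transform_group(markers, 0.5)
print(bbox_center(each[0]), bbox_center(grouped[0]))
```

With the per-object version each marker shrinks in place; with the grouped version the markers also slide toward the common center, which is why the scaling should be done while they are ungrouped.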

                 


                Correct graphical mistakes;

                 

                Depends what kind of graphical mistakes you mean.

                 


                Annotate plots (add an arrow here, a comment there...).

                 

                If done in Illustrator, set up Graphic Style(s) and/or Brush(es) which you can consistently apply with a click to the annotations. If done in the PDF, and if you have access to Acrobat or Acrobat Pro, those two programs provide an annotation toolset.

                 

                The above (at least in the provided sample) should be a matter of minutes, not hours.

                 

                Having said all that, more general thoughts:

                 

                Illustrator has a rudimentary graphing toolset, and includes scatter plots. Not sure it can handle your data, or do everything you want to do with it, but I mention it in case you don't know.

                 

                Again, the file is not large or complicated. But in Illustrator, whenever you have a kazillion instances of the same object in the same document (your datapoint markers), it just makes good sense, if you can, to use Illustrator Symbols. Symbols are graphics stored in a library. Instances of a Symbol on the page are mere position/scale references to the single stored Symbol, so using them minimizes file size. This file is already quite small. But using Symbols would allow you to do things like select all instances at once, replace them with other Symbols, etc. (But not sure you even need that.) The problem would be that your markers are not single objects or Groups. So you would have to devise a means by which to isolate what constitutes a particular kind of marker; and that would probably require a scripting workaround.
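The idea of "one stored definition, many lightweight references" can be sketched as a data structure. This is plain Python, not Illustrator's object model; `SymbolDef`, `Instance` and the helper functions are hypothetical names, and the 3500 instances simply echo the number of data points per diagram mentioned in the original post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SymbolDef:
    """A marker shape stored once (hypothetical stand-in for a Symbol)."""
    name: str
    outline: tuple  # tuple of (x, y) points

@dataclass
class Instance:
    """A placement that only references a definition plus position/scale."""
    symbol: SymbolDef
    x: float
    y: float
    scale: float = 1.0

def stored_points(instances):
    """Outline points stored when each distinct definition is kept once."""
    definitions = {inst.symbol for inst in instances}
    return sum(len(d.outline) for d in definitions)

def replace_symbol(instances, old, new):
    """Re-point every instance of `old` at `new`; return how many changed."""
    changed = 0
    for inst in instances:
        if inst.symbol is old:
            inst.symbol = new
            changed += 1
    return changed

cross = SymbolDef("Cross", ((-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)))
circle = SymbolDef("Circle", ((0.0, 1.0), (1.0, 0.0), (0.0, -1.0), (-1.0, 0.0)))

instances = [Instance(cross, float(i % 70), float(i // 70)) for i in range(3500)]

expanded = len(instances) * len(cross.outline)  # every marker drawn as its own paths
shared = stored_points(instances)               # outline kept once, referenced 3500 times
replaced = replace_symbol(instances, cross, circle)
print(expanded, shared, replaced)
```

Swapping the marker design then touches one reference per instance instead of redrawing every marker, which is the editing convenience described above.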

                 

                However, I don't understand this part:

                 


                I then "merge" all the pdf plots by "printing" them all on one page using "Adobe pdf" printer in Acrobat.

                 

                That implies to me that you are first creating an individual graph of each marker type. If that's the case, it raises the question: With the program you're using to create one of the initial graphs, do you have a choice as to the marker used? For example, could it be a single-path of any shape? Could it be a simple single text character or number? If so, that would render trivial from a scripting perspective the problem of targeting all instances of a particular marker and replacement of them with a Symbol of your own design.

                 

                Taking that one step further, though: If the initial data you are working with exists as, or can be represented as ordinary tab-delimited data, as in a spreadsheet, it makes one wonder if your whole workflow is not more complicated than necessary. Several approaches come to mind:

                 

                As far as Illustrator is concerned, scripting the entire creation of a graph like this directly from the raw data values--including the value scales, titles, legend, etc.,  would not be difficult, used in conjunction with a template pre-populated with the desired Symbols and Styles. The resulting graph wouldn't be "live" wherein you could just import fresh data into the same graph; but that's not true of your existing solution, either. Graphing a new dataset would be a simple matter of re-running the script.

                 

                Illustrator's XML-based Variables feature includes Graph Data as one of the four kinds of objects that can be bound as a Variable Object. So if your data can be expressed as a suitably-formed XML file, that might be a solution for an updateable "live" graph.

                 

                The big new feature of FileMaker Pro 11 is built-in graphing. FMP can export directly to print-suitable PDF. The graphing feature is no-nonsense; that is, it's not real sophisticated from an "artwork" standpoint. But it can do marvelous things when combined with FMP's data-handling capabilities like reporting and summary functions.

                 

                The one caveat I see relative to the three last thoughts is that orange blob in your sample. I assume that's a region surrounding the set of values in a particular range. That kind of thing isn't built into things like Illustrator's or FileMaker's basic graphing features. It could possibly be included in a scripted solution, given the repeating or user-defined parameters. I imagine such things are built into more robust dedicated math/graphing applications.
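On the first of the three approaches (scripting the graph directly from raw data values), the core of any such script is a mapping from data values to page coordinates. The following is a minimal Python sketch of that mapping only, not an Illustrator script; `make_axis`, the axis ranges and the page extents in points are assumptions chosen for illustration, and the optional log scale is included only to show that non-linear axes fit the same pattern.

```python
import math

def make_axis(data_min, data_max, page_min, page_max, log=False):
    """Return a function mapping a data value to a page coordinate (in points)."""
    transform = math.log10 if log else float
    lo, hi = transform(data_min), transform(data_max)
    if hi == lo:
        raise ValueError("axis range must not be empty")

    def to_page(value):
        # Fraction of the way along the axis, then stretched onto the page span.
        fraction = (transform(value) - lo) / (hi - lo)
        return page_min + fraction * (page_max - page_min)

    return to_page

# Hypothetical axes: linear x from 0 to 100, logarithmic y from 1 to 1000.
x_axis = make_axis(0.0, 100.0, 50.0, 450.0)
y_axis = make_axis(1.0, 1000.0, 50.0, 350.0, log=True)

# Hypothetical rows as they might come from a tab-delimited export.
rows = [(25.0, 10.0), (50.0, 100.0)]
placed = [(x_axis(x), y_axis(y)) for x, y in rows]
print(placed)
```

Each placed coordinate would then receive one Symbol instance from the template; re-running the script on a new dataset regenerates the graph.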

                 

                [The reference to FreeHand's GF&R feature was more for the benefit of other AI users.]


                 

                JET

                • 5. Re: Files with many objects -- CS4 slows to the point of being unusable
                  jf_moyen Level 1

                  Thanks for taking the time to consider my problem; as I point out below, however, it appears that we're on the wrong track here. I do not so much need help with processing graphs (as a matter of fact, most of what you mention I already knew how to do); rather, the issue is that the file I'm trying to edit somehow causes AI to slow down tremendously, at least on my system. That said, the misunderstanding is probably mine, as I guess some things that are obvious in scientific publishing are not so in the world of graphic design... and conversely.

                   


                  The problem sounds interesting. I don't speak the language of the other link, and couldn't find a button on the page to download the native AI file. But I downloaded the PDF and opened it in AI.

                   

                  Sorry about the link -- pardon my french :-) . I thought the site was supplying direct download links.

                   

                  I uploaded (somewhere else !) a couple of files corresponding to the project I'm working on, the pdf and the ai; should give you perhaps a more accurate impression :

                   

                  http://jfmoyen.free.fr/temp/500each_A0_1p.pdf
                  http://jfmoyen.free.fr/temp/500each.ai

                   

                  The file has less than 2000 paths. 222K on disk. That's not a complex file, and I don't see any performance problems at all with it.

                   

                  Yes, one single graph is Ok, the problem arises when dealing with many, as you should see with the couple of links above.

                   

                  Change font/size etc. for axis labels and scales

                   

                  That's easy: Marquee-select the whole row of labels across an axis, select a different size or other text attributes in the Characters Palette. Better (for recurring use) define Character Styles with all the desired formatting, and you can thereafter reformat the selected text objects with one click.

                   

                  That's easy, and so is the rest. My issue is not "how to do these changes" -- this I know how to do. My issue is that when I open the pdf file (and even the ai file), Illustrator is so slow that doing anything at all is long and painful. Scrolling, selecting, etc. take several seconds each, a very noticeable lag, not justified by the size or the complexity of the file.

                   

                  (In fact, in this specific case it's not so easy, because R generates a crazy pdf with text objects spanning two labels -- look at the Ti-V diagram, bottom row, second from the right, to see what I mean -- but anyway.)

                   

                  (snip the next few items -- yes I know how to use layers and to select all according to stroke/fill using "select similar", and a few others as well)

                   

                  Illustrator has a rudimentary graphing toolset, and includes scatter plots. Not sure it can handle your data, or do everything you want to do with it, but I mention it in case you don't know.

                   

                  Too rudimentary indeed (more precisely it would require too much pre-processing of the data to suit my needs).

                   

                  Again, the file is not large or complicated. But in Illustrator, whenever you have a kazillion instances of the same object in the same document (your datapoint markers), it just makes good sense, if you can, to use Illustrator Symbols. Symbols are graphics stored in a library. Instances of a Symbol on the page are mere position/scale references to the single stored Symbol, so using them minimizes file size. This file is already quite small. But using Symbols would allow you to do things like select all instances at once, replace them with other Symbols, etc. (But not sure you even need that.) The problem would be that your markers are not single objects or Groups. So you would have to devise a means by which to isolate what constitutes a particular kind of marker; and that would probably require a scripting workaround.

                   

                  Never used symbols (... in > 10 years of Illustrator -- that's something I love about AI, I pretty much learn new tricks every single day I use it!), but it looks like a neat trick. However, as you say, it's difficult to implement here with composite symbols; although I could work upstream and generate a pdf using only the symbols I know R translates as single objects.

                   

                  How would I then change all objects (in one layer, say) to symbols ?

                   

                  However, I don't understand this part:

                   


                  me> I then "merge" all the pdf plots by "printing" them all on one page using "Adobe pdf"
                  me> printer in Acrobat.

                   


                  That implies to me that you are first creating an individual graph of each marker type.

                   

                  No, no, sorry, I explained poorly (again ? ). What I mean is that I generate 30 or so plots with the same dataset -- each colour coded according to data type. I then want to update all 30 graphs.

                   

                  R generates either 30 pdf's, or a 30-page pdf with one graph on each page. As I'm trying to modify all of them in one go, I'm first putting all the graphs on a single page (the pdf file linked above) before starting the editing process. So far I have found two ways:

                   

                  1) Open the pdf file at page 1, ctrl+A, ctrl+C, move to a blank file, ctrl+V, iterate for every page of the pdf;
                  2) "print" the 30-page pdf on one single page (the pdf file linked above) and open it directly in AI.

                   

                  (1) is quite painful, as the lag when you reach page 20 or so becomes terrible, and moving objects at that stage to position them properly... *shrug* So I resorted to (2), and have that large, one-page pdf.

                   

                  If that's the case, it raises the question: With the program you're using to create one of the initial graphs, do you have a choice as to the marker used? For example, could it be a single-path of any shape? Could it be a simple single text character or number? If so, that would render trivial from a scripting perspective the problem of targeting all instances of a particular marker and replacement of them with a Symbol of your own design.


                   

                   

                   

                  I do have control over the markers -- to a point. In fact I can choose one of 15 or 20 options, rendered differently as you've seen (as text, single path or multiple paths). But I could certainly choose the type of symbols that generates objects easier to post-process, if I can subsequently replace them with adequate symbols. On the other hand, I cannot use arbitrary symbols, as far as I know.

                  Taking that one step further, though: If the initial data you are working with exists as, or can be represented as ordinary tab-delimited data, as in a spreadsheet,

                   

                  As the output of a PostgreSQL query, in actual fact, but yes, it's a csv or tab-delimited text file, or pretty much anything I want it to be if you factor in a bit of scripting. Although, the less the better.

                   

                  As far as Illustrator is concerned, scripting the entire creation of a graph like this directly from the raw data values--including the value scales, titles, legend, etc.,  would not be difficult,

                   

                   

                  An interesting thought; however it wouldn't fit the bill, as the plotting software I use actually does a bit more than plotting (it also does some calculations, such as plotting ratios of two variables, or using log axes, or triangular plots, or diagrams using certain constants...). It would work for plain, simple binary plots though.

                   

                   

                  The one caveat I see relative to the three last thoughts is that orange blob in your sample. I assume that's a region surrounding the set of values in a particular range. That kind of thing isn't built into things like Illustrator's or FileMaker's basic graphing features. It could possibly be included in a scripted solution, given the repeating or user-defined parameters. I imagine such things are built into more robust dedicated math/graphing applications.

                   

                   

                  But that's alright, I don't really mind adding a few items manually on each graph -- as I did in the (new) attached example.

                   

                  Thanks again for your time and suggestions, they are thought-provoking even if we're drifting a bit away from my original query -- but then, perhaps the real answer is "rethink your workflow entirely", and I'm open to that sort of approach too !

                  • 6. Re: Files with many objects -- CS4 slows to the point of being unusable
                    mcdnlt

                    Does anyone here know how to rationalise the points/vectors in a PDF? Indeed how do you count them in the first place (Ai only?)