I would reproduce that by placing each individual element on a separate layer and animating the position. This is easier and more efficient than laying out the whole project on a giant artboard and then trying to move the whole thing around. Typically I would create the artwork in Illustrator with each element that I want to animate on its own layer and the artwork positioned in its hero position. I start with a script, break the audio track down into phrases, or no more than a couple of sentences, then create the artwork a layer at a time, turning the bottom layers off as I move towards the top. A sentence or phrase is all that I put in a single AI file, then the AI file is imported as a comp. You could do the same thing with a layered Photoshop document or, depending on the style, you could use shape layers.
Inside AE you would make a separate comp for each phrase or sentence. If I were to reproduce your sample video I would create 15 or 20 separate illustrations, one for each change in the style or message, and create a composition for each illustration. One or two sentences to a comp. Then all of these scenes or shots (comps) would be edited in an NLE, where I would fine-tune the edit and the pacing of the audio track.
I know this sounds like more work, but trust me on this, because I've done hundreds of this kind of video: in the long run it is actually a lot less work and it goes faster. You get a project that can be polished or changed in response to client feedback, and you get a better product than you would by trying to do the entire video in a single comp.
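If it helps to see the "one or two sentences to a comp" idea written down, here is a rough planning sketch in plain Python. It is not After Effects scripting, and the function name `plan_comps` and the sample script text are made up for illustration. It only does a mechanical first pass; in practice I would still adjust the breaks by ear so each comp matches a change in the message.

```python
import re


def plan_comps(script, max_sentences=2):
    """Group the script's sentences into comps of at most max_sentences each."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", script.strip())
        if s.strip()
    ]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


# Hypothetical voice-over script, invented for this example.
script = (
    "Meet our product. It saves you time. It saves you money. "
    "Here is how it works. Sign up today."
)

for number, text in enumerate(plan_comps(script), start=1):
    print(f"Comp {number:02d}: {text}")
```

Each printed line would correspond to one AI file and one comp in the workflow described above.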
Just to repeat myself a bit, this scene:
A circle with words, an arrow, another circle with a word, another arrow, another arrow, another circle with words, another arrow and a final circle with words would be a single piece of artwork. Since the arrows and circles are all going to be animated on as if they are being drawn, I would use brush strokes to give that hand-drawn look, but I would also include a copy of each shape with a 1 point stroke and no brush applied. That copy could be converted to a shape layer, animated using Trim Paths and used as a track matte to reveal the drawing. My layers would look like this: Circle 1 matte, Circle 1, Arrow 1 matte, Arrow 1, Circle 2 matte, Circle 2, Arrow 2 matte, Arrow 2, Circle 3 matte, Circle 3, Arrow 3 matte, Arrow 3, Circle 4 matte, Circle 4, Arrow 4 matte, Arrow 4, Circle 5 matte, Circle 5. All of the layers are stacked up on top of each other.
When I opened the comp in AE I would convert all of the matte layers to shape layers with a stroke wide enough to cover the artwork, add Trim Paths to the matte layers and set up the track mattes to reveal the drawing. Then the text layers would be added and animated. The matte layers would be parented to their artwork and the text layers would be parented to their circles. When I had all of the artwork animating on the way I wanted, I would pre-compose the groups of circles and text layers and the group of arrow and arrow matte layers, and then animate the position of each of the pre-comps to get the movement I wanted on the screen. I'm guessing that seven second sequence would take me about 10 minutes to draw and 15 to 20 minutes to animate.
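The layer order described above follows a simple pattern: every piece of artwork has its matte sitting directly above it. Here is a small sketch of that pattern in plain Python, just as a naming checklist to work from in Illustrator. It is not AE scripting and does not touch any Adobe API, and the function names `layer_stack` and `matte_pairs` are my own, made up for this example.

```python
def layer_stack(shapes):
    """Return layer names from top to bottom, each matte directly above its artwork."""
    names = []
    for shape in shapes:
        names.append(f"{shape} matte")
        names.append(shape)
    return names


def matte_pairs(stack):
    """Pair each matte layer with the artwork layer directly below it."""
    return [(stack[i], stack[i + 1]) for i in range(0, len(stack), 2)]


# Shapes in the order listed above: circles and arrows alternate, ending on Circle 5.
shapes = []
for n in range(1, 5):
    shapes += [f"Circle {n}", f"Arrow {n}"]
shapes.append("Circle 5")

stack = layer_stack(shapes)
for matte, art in matte_pairs(stack):
    print(f"{matte:<15} is the track matte for {art}")
```

Keeping the names this regular makes it quick to set the track mattes and the parenting once the file is imported as a comp.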
The process would be repeated for every scene, sentence or phrase in the project. The longest I would go in a single comp is a short paragraph, because beyond that the comps get too many layers in them and it becomes far too difficult to adjust and perfect the timing.
I hope this helps.
ah alright thanks, i think i get it
unfortunately i dont know how to use illustrator but maybe ill learn