aleximac1

How to remove random extra spaces and auto format book?

Jun 29, 2013 8:56 PM

Tags: #font #text #format #layout #type #book #page

I have a ~500-page book I'm formatting in InDesign. I autoflowed all the text in, but unfortunately the original RTF document had many extra spaces and returns (Enter key) where there shouldn't be any. Basically, if the text following the previous paragraph is not indented, I need to delete any extra returns.

 

ANY HELP IS MUCH APPRECIATED

 

EXAMPLE:

 

LOOKS LIKE:

 

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam eu semper massa,

sed rhoncus eros. Nulla et tellus eget arcu fermentum porta. Mauris

lacinia, augue in scelerisque porttitor, nunc purus consectetur urna, vel sollicitudin mauris erat eu ante.

 

 

    

          Integer dignissim urna massa, vel facilisis lorem ornare vitae. Maecenas imperdiet purus justo. Aenean tempus vehicula leo malesuada egestas. Sed in ipsum eu mi consequat imperdiet. Suspendisse convallis rutrum massa ac faucibus. Maecenas felis velit, rhoncus nec imperdiet et, vestibulum et enim.

Nunc orci orci, aliquet sed dictum in, aliquam vitae tortor. Proin tempus mauris quam, sit amet ornare massa bibendum non. Vivamus eget condimentum magna. Phasellus risus sapien, molestie id vulputate vel, condimentum vel magna. Quisque in elit arcu. Mauris consequat placerat feugiat. Curabitur cursus nulla libero, eu mattis ante dictum suscipi.

 

WHAT I NEED IT TO LOOK LIKE:

 

     Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam eu semper massa, sed rhoncus eros. Nulla et tellus eget arcu fermentum porta. Mauris lacinia, augue in scelerisque porttitor, nunc purus consectetur urna, vel sollicitudin mauris erat eu ante.

 

     Integer dignissim urna massa, vel facilisis lorem ornare vitae. Maecenas imperdiet purus justo. Aenean tempus vehicula leo malesuada egestas. Sed in ipsum eu mi consequat imperdiet. Suspendisse convallis rutrum massa ac faucibus. Maecenas felis velit, rhoncus nec imperdiet et, vestibulum et enim.

Nunc orci orci, aliquet sed dictum in, aliquam vitae tortor. Proin tempus mauris quam, sit amet ornare massa bibendum non. Vivamus eget condimentum magna. Phasellus risus sapien, molestie id vulputate vel, condimentum vel magna. Quisque in elit arcu. Mauris consequat placerat feugiat. Curabitur cursus nulla libero, eu mattis ante dictum suscipi.


 
Replies
  • Jun 29, 2013 10:46 PM   in reply to aleximac1

    Find and replace.

    In some cases you will also need GREP, for example for multiple spaces.

     
  • Jun 30, 2013 12:33 AM   in reply to aleximac1

    As I told you: GREP. Find & Replace has several presets, and one of them is a multiple space replace preset. Use that.

     
  • Jun 30, 2013 2:28 AM   in reply to aleximac1

    @Alex – to build a fundamental understanding of what GREP is and how to use it in InDesign, see:

     

    http://www.kahrel.plus.com/indesign/grep_matters.html

     

    Take a week off and study carefully…

     

    Uwe

     
  • Jun 30, 2013 2:45 AM   in reply to Laubender

    And if one day you wonder what a "positive lookahead" is (in the context of GREP of course), see the following:

     

    http://carijansen.com/2013/03/03/positive-lookahead-grep-for-designers/

     

    GREP is one of the best features InDesign has. Period!


    Uwe

     
  • Jun 30, 2013 3:02 AM   in reply to aleximac1

    There is a FindChangeByList sample script in the scripts panel that will remove multiple spaces and empty paragraphs so you shouldn't need to write your own GREP for most of this. It also changes multiple tabs to single tabs, though, so it isn't a good idea to use it if you have "tables" built from ordinary tabbed text (you'll get column shifts in rows that have a null entry in the middle); it also changes <space>-<space> to an en-dash and -- to an em-dash; and finally, it removes leading tabs at the start of a paragraph (the notion being that paragraphs should have first-line indent attributes rather than tabs), so you may need to do some other work before you run it.
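To make the behavior described above concrete, here is a rough Python sketch of a find/change list. It is not the FindChangeByList script itself: the rule list, the `clean` function name, the use of "\n" as the paragraph return, and the choice to drop the spaces around the en dash are all assumptions made for illustration.

```python
import re

# Hypothetical rule list modeled on the behaviors described in the post.
# It is NOT the actual contents of the FindChangeByList sample script.
# "\n" stands in for a paragraph return.
RULES = [
    (r" {2,}", " "),      # runs of spaces -> a single space
    (r"\t{2,}", "\t"),    # runs of tabs -> a single tab (breaks tabbed "tables")
    (r" - ", "\u2013"),   # space-hyphen-space -> en dash (assumption: spaces are consumed)
    (r"--", "\u2014"),    # double hyphen -> em dash
    (r"^\t+", ""),        # tabs at the start of a paragraph
    (r"\n{2,}", "\n"),    # empty paragraphs
]


def clean(text: str) -> str:
    """Apply every rule in order and return the cleaned text."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text, flags=re.MULTILINE)
    return text


if __name__ == "__main__":
    sample = "a  b\t\tc - d--e\n\n\n\tf"
    print(repr(clean(sample)))
```

The order of the rules matters: leading tabs are stripped before empty paragraphs are collapsed, so a paragraph that contains only tabs also disappears. That is one reason to check a copy of the document before running such a list on the real thing.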

     

    It would be extremely helpful if you showed us a screen shot of some sample page with both non-printing characters and the baseline grid showing, along with the font size/leading specs (make sure the cursor is active in the text someplace) so we can see exactly what is happening between lines (if there is an actual break character at the end of each line that needs to be removed, that's a lot more complex, for example, than changing the leading in the style or adjusting a grid). You can embed an image in your posts using the camera icon on the web page, like this:

    CameraIcon.png

     
  • Jul 1, 2013 2:15 AM   in reply to aleximac1

    @Alex – unfortunately you are not showing the non-printing characters like paragraph signs etc. Go to the Type menu and choose "Show Hidden Characters". Then redo your screen shots and post them here…

     

    Did you export the text from a PDF?

    A page number like "144" in the middle of your story indicates this…


    As a first step, I suggest you get rid of those.

     

    GREP is all about patterns. So if you want to remove those, you have to figure out whether all the page numbers like "144" follow a distinct pattern.


    It could be like this:

     

    Assumption 1:
    All the page numbers stand alone in their own paragraphs (we cannot know that without showing "Hidden Characters" and a few more samples)

     

    We could phrase this pattern in GREP:

     

    ^\d+$
    

     

    That means: starting at the beginning of a paragraph (^), there are one or more digits (\d matches a single digit; the plus sign extends the match to multi-digit numbers), followed by the end of the paragraph, indicated by the $ sign.
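To try the pattern outside InDesign, here is a small Python sketch. The sample story and both function names are made up for illustration, and "\n" stands in for InDesign's paragraph return, so `re.MULTILINE` is needed to make ^ and $ match at each paragraph boundary.

```python
import re

# Hypothetical sample story; "\n" stands in for InDesign's paragraph return.
story = "end of one page\n144\nstart of the next page, 12 lines in\n7\nlast line"

# ^\d+$ with MULTILINE: a paragraph that consists of digits only.
STANDALONE_NUMBER = re.compile(r"^\d+$", re.MULTILINE)


def find_standalone_numbers(text: str) -> list[str]:
    """Return every paragraph that contains nothing but digits."""
    return STANDALONE_NUMBER.findall(text)


def drop_standalone_numbers(text: str) -> str:
    """Delete digit-only paragraphs together with their trailing return."""
    return re.sub(r"^\d+\n", "", text, flags=re.MULTILINE)


if __name__ == "__main__":
    print(find_standalone_numbers(story))
    print(drop_standalone_numbers(story))
```

Note that the "12" inside a sentence is left alone, because it does not fill a whole paragraph. A genuine one-number paragraph in the book (a year, say) would be caught too, which is why the pattern is only as good as Assumption 1.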

     

    Assumption 2:
    The number is always surrounded by two paragraph signs above and below (again, we cannot know, because we have no insight into your document, but it is quite likely).
    If so, we could look at the characters before and after the match from #1.

     

    For that we could use a "positive lookbehind" and a "positive lookahead" for exactly two paragraph signs:

     

    (?<=\r\r)^\d+$(?=\r\r)
    

     

    This is just an example. It all depends on whether we can see a general pattern.
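The lookbehind/lookahead idea can be checked the same way in Python. Again this is only a sketch under assumptions: "\n" replaces InDesign's \r, and the sample text and function name are invented for the test.

```python
import re

# Same idea as (?<=\r\r)^\d+$(?=\r\r), with "\n" standing in for the
# paragraph return. The lookarounds check the neighbors without consuming them.
ISOLATED_NUMBER = re.compile(r"(?<=\n\n)^\d+$(?=\n\n)", re.MULTILINE)


def find_isolated_numbers(text: str) -> list[str]:
    """Return digit-only paragraphs that have an empty paragraph on both sides."""
    return ISOLATED_NUMBER.findall(text)


if __name__ == "__main__":
    # Hypothetical sample: "144" is surrounded by empty paragraphs, "1999" is not.
    sample = "para one\n\n144\n\npara two\n1999\npara three"
    print(find_isolated_numbers(sample))
```

Because the lookarounds only test the surrounding returns, a replacement with an empty string would remove the number itself and leave the empty paragraphs behind; those would then need their own cleanup pass.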

     

    Uwe

     
