19 Replies Latest reply on Dec 17, 2016 3:22 AM by Laubender

    Hidden Character not displaying?

    peterh79752694

      I have a document in InDesign (CC2015.4) that was placed from an MS Word document. Apparently it was originally an HTML document, but I was given it as a Word file, and it's going to end up as an .epub.

       

      When showing hidden characters I still have a blank, space-like character that has no hidden-character marker. It looks like a space, except that the real space character before it displays as a small blue bullet. There are tons of them in the document and I would like to get rid of them, as I suspect they are causing a problem in the exported epub (weird characters and spaces displaying in iBooks). In MS Word, if I select this "space" it shows the font as "MS Mincho", whereas all other characters are in Times New Roman. I can't figure out how to replace just this character or the use of the MS Mincho font in Word (it won't copy/paste), nor does it read as two space characters in find/replace. In InDesign I have long since set the paragraph style, so I see no change in font for this character, but I'm still getting these weird outputs in the epub (see attached pic).

       

      If I select all in the MS Word document and change the font, the font changes, but all these "spaces" in MS Mincho do not change!

       

      Anyone know what's going on here, and if possible how to fix this? Thanks

       

      ID PROBLEM.jpg


        • 1. Re: Hidden Character not displaying?
          Steve Werner Adobe Community Professional & MVP

          I would try opening the Find/Change dialog (Cmd/Ctrl-F). In your document, copy one of the hidden characters to the Clipboard.

           

          In the Text tab, click in the "Find what" field. Paste the character.

           

          Leave the "Change to" field blank. This should delete it.

           

          Search the document.

          • 2. Re: Hidden Character not displaying?
            peterh79752694 Level 1

            Yeah, I tried that but it doesn't copy or paste. It seems to (that "character" can be selected in the Find text box) but it doesn't function: nothing is found or replaced, either in Word or ID, and in fact not even in BBEdit or TextEdit as plain text!

             

            I am trying to avoid having to reformat the document again from scratch but I guess I gotta just knuckle down and do it.

            • 3. Re: Hidden Character not displaying?
              Peter Spier Most Valuable Participant (Moderator)

              Just for fun, try again in Find/Change using the plain text tab rather than GREP, if you didn't before. There are some characters that don't seem to work in GREP.

              • 4. Re: Hidden Character not displaying?
                Steve Werner Adobe Community Professional & MVP

                You may have to remove it in the Word document by using Find/Change there.

                • 5. Re: Hidden Character not displaying?
                  Laubender Adobe Community Professional & MVP

                  Hi Peter,

                  you could look up the Unicode value of that character in the Info Panel.
                  Just select the character and open the Info Panel.

                   

                  Please read out its value and post back…

                   

                  Thanks,
                  Uwe

                  • 6. Re: Hidden Character not displaying?
                    jane-e Adobe Community Professional

                    After you select the odd character, right-click it and choose "Load Selected Glyph in Find"

                     

                     

                    This puts you into the Glyphs section of Find/Change

                    Repeat for the glyph you want to replace it with and load that one into the Change

                    In this example, I am searching for the space I selected above and replacing with the "T" to the right of it. You will use your problem glyphs.

                    I am doing it within the Story, it looks like. You have other choices there.

                     

                    • 7. Re: Hidden Character not displaying?
                      BobLevine MVP & Adobe Community Professional

                      Have you looked in the story editor? Have you tried to copy/paste into a plain text document?

                      • 8. Re: Hidden Character not displaying?
                        peterh79752694 Level 1

                        Hi everyone. Thanks for all the replies.

                         

                        I tried "find glyph" and got this blank dialog. While the font was correctly identified as MS Mincho (missing), the glyph number in the bottom section was listed as 0020 (space) but the top section was blank. If I select the space character immediately before it I get the same result: blank above but 0020 below (just in the font Minion this time, as the "real" space is in that font).

                         

                        ID ISSUE.jpg

                         

                         

                         

                        Tried search/replace in Word via copy/paste into Find, but no go. Nothing happens. Copied everything into TextEdit and BBEdit, made it all plain text, and that character is still there! When I paste from those text files back into ID it comes along too! Neither app recognises it, and neither will find or replace it. Apple's built-in glyph viewer (Show Emoji and Symbols) doesn't recognise it when pasted into its text box.

                         

                        But some progress: as Uwe suggested, I opened the Info panel, and sure enough it displays a different Unicode value: 0x2028. I tried some more of them and yes, it's the same Unicode number on all of them. I googled that code point and it is listed as "line separator".
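Outside InDesign, the same kind of check can be done on plain text that was copied out of the document. The following is only a minimal sketch in Python, assuming the text is already available as a string; the function name `report_invisible` is made up for illustration and is not part of any Adobe tool.

```python
import unicodedata


def report_invisible(text):
    """List separator, format and control characters other than a plain space.

    Returns tuples of (index, code point, Unicode name).
    """
    suspicious = {"Zs", "Zl", "Zp", "Cf", "Cc"}
    hits = []
    for index, ch in enumerate(text):
        if ch != " " and unicodedata.category(ch) in suspicious:
            # Control characters have no Unicode name, hence the fallback.
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits


if __name__ == "__main__":
    for hit in report_invisible("word \u2028I am"):
        print(hit)
```

A character that looks like a space but reports as U+2028 LINE SEPARATOR would show up here even when an editor draws nothing for it.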

                         

                        So I guess now my question is - how would I search for and replace that particular glyph? Can you find/replace a unicode number? Nothing seems to activate in the glyph find replace dialog - the find/next etc buttons are dimmed. (Although TBH I've never used that option before!)

                         

                        Screen Shot 2016-09-08 at 12.35.22.png
                        Screen Shot 2016-09-08 at 12.50.03.png


                        • 9. Re: Hidden Character not displaying?
                          peterh79752694 Level 1

                          OKAY! Finally got it right - I first did a "Find Font" to get rid of that MS Mincho font and then when I copied and pasted the character into the Text Replace Find panel it worked! All gone.... Thanks for all the help, I was going crazy with this. And the side benefit is I could replace it with a ^p and saved a whole lot of time reformatting according to the paper reference!!

                           

                           

                          • 11. Re: Hidden Character not displaying?
                            Laubender Adobe Community Professional & MVP

                            peterh79752694 wrote:


                            But some progress: as Uwe suggested, I opened the Info panel, and sure enough it displays a different Unicode value: 0x2028. I tried some more of them and yes, it's the same Unicode number on all of them. I googled that code point and it is listed as "line separator".

                             

                            So I guess now my question is - how would I search for and replace that particular glyph? Can you find/replace a unicode number?

                            Hi Peter,

                            you can search for <2028> in the Text tab of Find/Change and replace it with nothing.

                            Then that character is gone.

                             

                            From my German InDesign:

                             

                            Text-FindReplace-Unicode-2028.png

                             

                            Or also with GREP Find/Replace.

                             

                            Find expression:
                            \x{2028}

                             

                            Replace with nothing.

                            I hope the syntax for doing this with Text or GREP Find/Change is now clear.
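If the text is cleaned before it is placed in InDesign (for example as a plain-text export), the same search can be sketched in Python. This is an assumption-laden illustration, not an InDesign feature: the function name `replace_line_separator` is hypothetical, and the optional replacement mirrors Peter's choice of swapping the character for a paragraph break instead of deleting it.

```python
import re

# U+2028 LINE SEPARATOR, the character the Info panel reported.
LINE_SEPARATOR = re.compile("\u2028")


def replace_line_separator(text, replacement=""):
    """Replace every U+2028 in text; the default removes it entirely."""
    return LINE_SEPARATOR.sub(replacement, text)


if __name__ == "__main__":
    sample = "first line\u2028second line"
    print(repr(replace_line_separator(sample)))
    print(repr(replace_line_separator(sample, "\n")))
```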

                             

                            Regards,
                            Uwe

                            • 12. Re: Hidden Character not displaying?
                              peterh79752694 Level 1

                              Thanks Uwe. Definitely this frustrating problem was a learning opportunity in disguise, no matter how obscure!

                              • 13. Re: Hidden Character not displaying?
                                philippanmei Level 4

                                In MS Word:

                                Press Ctrl+Shift+8 (Show/Hide) and see whether there are any extra spaces, hidden special characters, symbols, or anything else unexpected. If so, try find/replace in MS Word itself. It should be very easy to do.

                                 

                                 

                                Thanks

                                • 14. Re: Hidden Character not displaying?
                                  Laubender Adobe Community Professional & MVP

                                  Hi Peter,

                                  did you find out what the purpose of the "line separator" <2028> was in the original Word document?
                                  Or did an "end of line" format in the HTML document that preceded the Word file have a purpose?

                                  Was <2028> perhaps intended as a "break point" in case the HTML element container changes its width?

                                   

                                  Or did InDesign import a different character as <2028> ?

                                   

                                  Another thing:
                                  In the screenshot from your initial post you are showing two distinct behaviors of invisible special characters.
                                  Was "line separator" <2028> found in both cases?

                                   

                                  iBooks is showing a * where initially there were three characters: <2028> followed by "I" followed by a blank.

                                  Was the other one showing the HTML of an exported ePub? If so, InDesign perhaps transformed the second character after <2028>, which was "L", into: / .

                                   

                                  How can that be explained?

                                   

                                  Regards,
                                  Uwe

                                  • 15. Re: Hidden Character not displaying?
                                    jwyatt@bayard Level 1

                                    I just ran into this same issue, with one addition. I also first caught it in an epub, where what looked like a space in InDesign came out with the space omitted.

                                     

                                    In Dreamweaver the space shows as a large red dot in the code (Design view shows the space). And when I copied and pasted it into TextEdit or TextWrangler it turned into a line break.

                                     

                                    Thanks to the info panel suggestion above (face palm for not thinking of it myself) I discovered that it was unicode 2028 as well. Find/change worked for me in InDesign. AND it worked in Dreamweaver so it wasn't a big deal in the end.

                                     

                                    The bigger issue really may be, why is InDesign translating a code that should be a line break as a space?
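The TextEdit/TextWrangler behaviour described above has a parallel in some programming libraries, which treat U+2028 as a line boundary while others do not. As a small illustration only (this says nothing about how InDesign itself is implemented), Python's `str.splitlines()` breaks on U+2028, whereas splitting on "\n" alone ignores it. The helper names below are made up for the example.

```python
def count_unicode_lines(text):
    """Count lines using str.splitlines(), which also breaks on U+2028."""
    return len(text.splitlines())


def count_newline_lines(text):
    """Count lines by splitting on the newline character only."""
    return len(text.split("\n"))


if __name__ == "__main__":
    sample = "first\u2028second\nthird"
    print(count_unicode_lines(sample), count_newline_lines(sample))
```

Two tools in the same workflow can therefore disagree about whether the same character is a line break or just an odd blank.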

                                    • 16. Re: Hidden Character not displaying?
                                      Laubender Adobe Community Professional & MVP

                                      jwyatt@bayard wrote:

                                      … The bigger issue really may be, why is InDesign translating a code that should be a line break as a space?

                                      Hi,

                                      just a note on this:

                                      It seems that the special character <2028> is supported correctly in the form &#x2028; with InDesign's XML import.

                                      After import into a text frame it is translated to <000A>, whereas &#xA; is translated to <000D>, the end-of-paragraph character.
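For comparison, here is a rough sketch of what a generic XML parser makes of the two character references before any application-specific mapping happens. It uses Python's standard `xml.etree.ElementTree`; the helper name `decoded_text` is hypothetical, and the translation to <000A> and <000D> described above is InDesign's own behaviour, which this sketch does not reproduce.

```python
import xml.etree.ElementTree as ET


def decoded_text(xml_fragment):
    """Parse a one-element XML fragment and return its decoded text content."""
    return ET.fromstring(xml_fragment).text


if __name__ == "__main__":
    # &#x2028; decodes to U+2028, &#xA; decodes to U+000A (line feed).
    print(repr(decoded_text("<p>a&#x2028;b</p>")))
    print(repr(decoded_text("<p>a&#xA;b</p>")))
```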

                                       

                                      Example with XML and InDesign CS6 v8.1.0 on Mac OSX 10.7.5:

                                       

                                      SpecialCharacter-0028-with-XML-placed-in-text-frame.png

                                       

                                      Discussion around special characters and XML import here at the German language www.hilfdirselbst.ch:

                                      Codierung für Zeilenumbruch beim Import von XML - Adobe InDesign - HilfDirSelbst.ch - Forum

                                       

                                      Regards,
                                      Uwe

                                      • 17. Re: Hidden Character not displaying?
                                        Laubender Adobe Community Professional & MVP

                                        And after reading this:

                                        Newline - Wikipedia

                                         

                                        I think the problem is a complex one.
                                        Different programming libraries may well interpret <2028> differently.
                                        ( At least the subject is not fully clear to me. ;-) )

                                         

                                        Based on this I wanted to show that a "simple" line break in Word is translated correctly when imported into InDesign through the Word import filter. But I had mixed results, depending on the format of the Word file: doc vs. docx.

                                         

                                        Word's doc format:

                                         

                                        LineBreak-Word-doc-format.png

                                         

                                        doc file imported as formatted text to InDesign is showing nothing special:

                                         

                                        LineBreak-Word-doc-format-imported-to-InDesign.png

                                         

                                        Whereas a docx file imported as formatted text shows an <FEFF> special character in combination with <000A>:

                                         

                                        LineBreak-Word-docx-format.png

                                        LineBreak-Word-docx-format-imported-to-InDesign-1.png

                                        LineBreak-Word-docx-format-imported-to-InDesign-2.png

                                        And that's really bad.

                                        <FEFF> characters have several purposes within InDesign.

                                         

                                        They can denote the following (and maybe others):

                                         

                                        XML tags

                                        Note objects

                                        Index markers

                                        Bookmark markers

                                         

                                        And indeed with the docx import there came a bookmark object:

                                         

                                        LineBreak-Word-docx-format-imported-to-InDesign-3.png

                                         

                                        And that's not the fault of InDesign's import filter.

                                        The bookmark is there in the XML ( document.xml) that is packaged within the docx file.

                                         

                                        We can break open the docx file. Under the hood it is an XML representation (comparable to a similarly structured container file such as IDML, InDesign's own exchange format). We can inspect its document.xml with InDesign's XML Structure window.

                                         

                                        And we will see that the bookmark is already defined in the docx file.

                                        Don't know how this could happen. I simply pressed Shift Return when in Word.
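The same inspection can be sketched without InDesign, since a docx file is a ZIP archive that stores the main text in word/document.xml. The snippet below is a minimal illustration using only Python's standard library; the function name `list_bookmarks` is made up for this example, and it only reads bookmark names without modifying the file.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace of the WordprocessingML elements (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_bookmarks(docx_file):
    """Return the names of all w:bookmarkStart elements in word/document.xml.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    name_attr = f"{{{W_NS}}}name"
    return [el.get(name_attr) for el in root.iter(f"{{{W_NS}}}bookmarkStart")]
```

Calling `list_bookmarks("some-file.docx")` (a placeholder path) on a file like the one described here would be expected to include `_GoBack` in the result.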

                                         

                                        Word-docx-XML-Structure.png

                                         

                                         

                                        And it's no fun getting rid of the newly imported special characters if XML markers from a Word docx import get mixed up with deliberately set Bookmark objects in InDesign, or with Note objects and Index markers that should stay intact, whereas the freshly imported <FEFF> special characters from the docx files should be removed. I've also sometimes seen stray <FEFF>s after Data Merge.

                                         

                                        What I'd like to say is: it could well be that, when text travels through a chain of apps in a workflow (HTML, XML, Word docx, InDesign), all sorts of strange things can happen. And it is not necessarily the fault of InDesign's import module.

                                         

                                        Regards,
                                        Uwe

                                        • 18. Re: Hidden Character not displaying?
                                          Laubender Adobe Community Professional & MVP

                                          Funnily enough, the bookmark that sneaked in with Word's docx format is named _GoBack. That just sounds like the command to an old typewriter to go back to the start of a line, in combination with a br command to break the line.

                                           

                                          A simple &#x2028; expression in the XML would perhaps have done the right thing ;-)

                                           

                                          Regards,
                                          Uwe

                                          • 19. Re: Hidden Character not displaying?
                                            Laubender Adobe Community Professional & MVP

                                            And it seems that Word interprets a bookmark named _GoBack in combination with a br command as a simple line break. Perhaps that's an idiomatic Microsoft thing with XML?

                                             

                                            Perhaps the Word import filter of InDesign should be adapted to that idiomatic use of XML …

                                             

                                            Regards,
                                            Uwe