    Compare InDesign file with ePub


      I have a 500-page book in InDesign and I created an ePub from it. Is there a way to compare the text of the ePub file with the original InDesign file, the way you can compare documents in MS Word to see the differences? That way I could easily track down obvious errors in the ePub file, like an omitted sentence, which would help with proofreading it. If InDesign doesn't have this functionality, perhaps someone knows of a plugin or a third party that could provide it.

          Ellis home Level 4

          That's an interesting proposition. I'd never thought about it. I did a 500-page book a while ago when I had just started using InDesign, but I only reviewed the ePub by eye. The main issue was where I had manually adjusted some lines in the print edition to even out the bottoms of pages. The ePub would show odd spaces there. Then I learned you could choose the "remove forced line breaks" option. I didn't have a case where a sentence was omitted. It would be interesting to know if there's a plugin for it.

            Joel Cherney Level 5

            I'd save both out as raw text and do a diff from the command line, but I'm old-fashioned that way. If you've never cracked open an ePub and looked at its innards, now is a great time! It's basically HTML and CSS in there, so the data format is radically different from InDesign's, or from IDML for that matter. What you're asking for is not trivial at all.


            Are you worried about artifacts of the conversion process, or of human error? If human error, I'd suggest saving raw text out of both and comparing with Your Preferred Document Comparison Tool of Choice.
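
            To make the raw-text comparison concrete, here is a minimal Python sketch. It assumes the InDesign text has already been exported as plain text and that the chapter markup has been pulled out of the ePub as a string. The names html_to_text, sentences and diff_texts are made up for illustration, and the sentence splitting is deliberately naive, so expect some noise around abbreviations and quotes.

```python
import difflib
import re
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent paragraphs do not run together.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "tr", "br"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores markup, <style> and <script> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip += 1
        elif tag == "br":
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self._skip = max(0, self._skip - 1)
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(markup):
    """Strip tags from an (X)HTML string and return the visible text."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


def sentences(text):
    """Collapse whitespace, then split naively after ., ! or ?"""
    flat = re.sub(r"\s+", " ", text).strip()
    return re.split(r"(?<=[.!?])\s+", flat) if flat else []


def diff_texts(indesign_text, epub_text):
    """Return unified-diff lines, one sentence per diff line."""
    return list(
        difflib.unified_diff(
            sentences(indesign_text),
            sentences(epub_text),
            fromfile="indesign",
            tofile="epub",
            lineterm="",
            n=0,  # no context lines: show only what differs
        )
    )
```

            Calling diff_texts(exported_text, html_to_text(chapter_markup)) gives a list of lines where a leading "-" marks a sentence present in the InDesign export but missing from the ePub. Splitting on sentences rather than lines keeps reflowed paragraphs from showing up as differences.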

              JelmerLA Level 1

              Hi Joel,

              I like your way of thinking. The book I'm doing has about 50 chapters (meaning 50 HTML files in the ePub file and 50 InDesign files). Do you know of a plugin or script that would merge the 50 files into one, so I don't need to do 50 comparisons? Merging files in InDesign itself is a time-consuming process.
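
              For the ePub side at least, the merge can be scripted, since an ePub is a zip archive whose package file lists the chapters in reading order. The Python sketch below is one possible approach, not a tested tool: the function names are hypothetical, it assumes the chapter files are UTF-8, and it assumes each InDesign chapter has already been exported to its own plain-text file with names that sort in chapter order.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from urllib.parse import unquote

# XML namespaces used by the ePub container and package (OPF) files.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub_path):
    """Return the markup of every chapter file, in the ePub's reading order."""
    with zipfile.ZipFile(epub_path) as zf:
        # META-INF/container.xml points at the package (.opf) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", CONTAINER_NS).attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))

        # The manifest maps item ids to file paths relative to the .opf file.
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.findall("opf:manifest/opf:item", OPF_NS)
        }
        base = posixpath.dirname(opf_path)

        docs = []
        # The spine lists item ids in reading order.
        for ref in opf.findall("opf:spine/opf:itemref", OPF_NS):
            href = unquote(manifest[ref.attrib["idref"]])
            member = posixpath.normpath(posixpath.join(base, href))
            docs.append(zf.read(member).decode("utf-8"))  # assumption: UTF-8 files
        return docs


def merge_epub_markup(epub_path):
    """Concatenate all chapter markup into a single string."""
    return "\n".join(spine_documents(epub_path))


def merge_text_files(paths):
    """Concatenate plain-text exports (e.g. one per InDesign chapter), sorted by name."""
    chunks = []
    for path in sorted(paths):
        with open(path, encoding="utf-8") as handle:
            chunks.append(handle.read())
    return "\n".join(chunks)
```

              The output of merge_epub_markup is still markup, so the tags need stripping before a diff. Reading the spine rather than sorting file names matters because the file names inside an ePub are not guaranteed to sort in chapter order.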