Some of this has to do with how users have applied styles in
WYSIWYG mode. For example, if you double-click a word (which also
selects the space after it) and apply Bold, then later drag the
cursor to select only the word (without the trailing space) and
turn Bold off, RH will usually wrap the selected characters in
"normal" span tags and often leave the original bold tags in
place.
This same kind of repeated formatting may already have been
done in the original source document before it was imported (Word,
Frame, etc.). So if you compound the code bloat from an
imported file with additional formatting in RH, well, you do the
math.
I've seen code wherein a six-word sentence might be saddled
with as many as 10-12 SPAN tags and might stretch into 6-7 lines of
code!
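If you're curious how bad a given topic is, here's a rough Python sketch that counts the opening SPAN tags in a chunk of HTML. The sample markup is made up for illustration and is not actual RH output, and the names SpanCounter and count_spans are my own, so treat it as a starting point only.

```python
from html.parser import HTMLParser


class SpanCounter(HTMLParser):
    """Counts opening <span> tags in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.spans = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <SPAN> and <span> both match.
        if tag == "span":
            self.spans += 1


def count_spans(html: str) -> int:
    """Return the number of opening span tags found in html."""
    parser = SpanCounter()
    parser.feed(html)
    parser.close()
    return parser.spans


# Hypothetical sample (not real RH output): two words buried in nested spans.
sample = (
    '<p><span style="font-weight: bold;">Click</span>'
    '<span style="font-weight: normal;"> </span>'
    '<SPAN style="font-weight: bold;">'
    '<span style="font-weight: normal;">Save</span></SPAN></p>'
)

print(count_spans(sample))
```

To check a real topic, read the .htm file into a string and pass it to count_spans.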
However, these are still flat files, not binary; you'd need a
ton of extra code to increase the aggregate file size a lot. Heck,
your graphics will usually take more room than your topics. One of
our child projects has 153 .htm files at 3.25 MB, whereas the
aggregate of .gif, .jpg, and .bmp graphics exceeds 4 MB. My view has
always been this: until the underlying code affects the format I
expect to see in my output, I consider extra SPAN and KADOV tags to
be nothing but white noise.
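If you want to run the same topics-versus-graphics comparison on your own project, something like this Python sketch would total the bytes on disk per file extension. The function names are mine, and it assumes your project files all sit under one folder; nothing here is specific to RH.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(root):
    """Return total bytes on disk per lowercase file extension under root."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Lowercase the suffix so .HTM and .htm are counted together.
            totals[path.suffix.lower()] += path.stat().st_size
    return dict(totals)


def megabytes(byte_count):
    """Convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return byte_count / (1024 * 1024)
```

Call size_by_extension on your project folder, then compare the ".htm" total against the sum of the ".gif", ".jpg", and ".bmp" totals; megabytes just makes the numbers easier to read.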
Good luck,
Leon