My first version of this question got no response, so I'm thinking my question is bigger than I thought. So here's one part of the question in simplified form.
Context for my question: we have a large quantity of documents that are reasonably well formatted using para/char styles and few overrides. My job is to add XML structure to the document. I've written some nice scripts that translate styles into tags and such, but I have some richer auto-tagging to do and have no idea how to get from here to there. I suspect that some kind of range management and/or grep-based search might help, but as a noob I don't know how to script that, have not found any examples after hours of searching/reading, and would love a few hints to get me going.
My first-step question (I think) is how to script a search (grep?) that will select a range based on paragraph styles (or tags if that helps -- making tags from styles is now easy for me). Here's what I want to do, using a (fake) example...
Paragraph style names are shown between # marks (or you could think of them as tags if that is easier/works better...)
#header#Topic A Header#/header#
#Content1#Intro: Day 1 topic#/Content1#
#ContentTricky#Blah blah blah#/ContentTricky#
#headerX#Topic B Header#/headerX#
#ContentNormal#This section has a slightly different header style, as you can see#/ContentNormal#
Two kinds of selection I need to do:
1) Select content from one para style up to but not including the next para of the same style (e.g. from #subhead# up to but not including the next #subhead#)
2) Select content AFTER one para style (e.g. AFTER #header#) up to but not including the next para of that same style
and finally a wildcard version:
3) Same as #2 but up to any para whose style begins with "header", i.e. header* (of course I have access to the whole list of styles, so if necessary I could search for each of several styles and choose the one that comes first...)
I'm assuming I do some kind of "Find para style X" then somehow adjust the start point to the spot right after that para... then do the same for the end point of the range... but this is quickly beyond me.
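To make the goal concrete, here is a minimal sketch of the range logic in plain Python, independent of any real scripting API. It assumes the document can be read as an ordered list of (style name, text) pairs; `PARAS`, `find_ranges`, and the parameter names are all hypothetical, and the wildcard matching uses Python's standard `fnmatch` module rather than anything the layout application provides.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the document: one (style_name, text) pair per paragraph.
PARAS = [
    ("header", "Topic A Header"),
    ("Content1", "Intro: Day 1 topic"),
    ("ContentTricky", "Blah blah blah"),
    ("headerX", "Topic B Header"),
    ("ContentNormal", "This section has a slightly different header style, as you can see"),
]


def find_ranges(paras, start_pattern, end_pattern=None, include_start=True):
    """Return a list of (start, end) paragraph-index pairs, with end exclusive.

    A range opens at each paragraph whose style matches start_pattern and
    closes just before the next paragraph whose style matches end_pattern
    (which defaults to start_pattern), or at the end of the list.
    Patterns are shell-style wildcards, so "header*" matches "headerX".
    """
    if end_pattern is None:
        end_pattern = start_pattern
    ranges = []
    i, n = 0, len(paras)
    while i < n:
        if fnmatchcase(paras[i][0], start_pattern):
            # Scan forward to the first paragraph that ends the range.
            end = i + 1
            while end < n and not fnmatchcase(paras[end][0], end_pattern):
                end += 1
            # Selection 1 keeps the opening paragraph; 2 and 3 skip past it.
            first = i if include_start else i + 1
            ranges.append((first, end))
            i = end  # the closing paragraph may itself open the next range
        else:
            i += 1
    return ranges


# 1) From a "header" para up to the next "header" para (start included).
print(find_ranges(PARAS, "header"))                       # [(0, 5)]
# 2) Same, but starting AFTER the "header" para.
print(find_ranges(PARAS, "header", include_start=False))  # [(1, 5)]
# 3) After "header", up to any para whose style begins with "header".
print(find_ranges(PARAS, "header", "header*", include_start=False))  # [(1, 3)]
```

Each pair is a half-open slice, so `PARAS[start:end]` gives the paragraphs in the range. What I still don't know is how to turn those paragraph positions into an actual selection or tagged range in the document itself.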
Thanks MUCH for any hints anybody can provide. Pointers to documents or tutorials would be fine. I just haven't found any examples that address these kinds of things.