Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
That's a cute quote, but this is exactly the sort of thing that regular expressions *are* good for.
The only problem here is that they are a little tricky. Regular expression quantifiers (the one applied to '.' in your case) are greedy by default, meaning they will "eat" as many characters as they can, as long as the rest of the expression can still be satisfied. The problem you're seeing occurs because the parenthetical group matches everything after 'href="' up to the *last* 'html' (*not* the first one, as you are expecting). The simplest way to fix this is to be more conservative about the exact page names you're going to accept: if none of them have embedded periods (i.e., you do not have any links named "foo.bar.html"), you can just change your RE to
That is, page names start with a character in the range [A-Z] (though case insensitive, as you specify later) followed by zero or more "not a period" characters, followed by ".html" (note that you do have to escape the period).
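To see the difference between the greedy pattern and the "no periods" pattern, here is a minimal sketch in Python's `re` module rather than Flex. The sample markup and both patterns are assumptions made up for illustration, not your actual input or RE, but the greedy behaviour is the same in any backtracking regex engine.

```python
import re

# Hypothetical input: two links on the same line.
html = '<a href="Intro.html">Intro</a> <a href="Setup.html">Setup</a>'

# Greedy: ".*" runs as far as it can, so the group ends at the LAST ".html".
greedy = re.compile(r'href="([A-Z].*\.html)"', re.IGNORECASE)

# Conservative: "[^.]*" cannot cross a period, so the group ends at the
# first ".html" after the opening quote.
no_period = re.compile(r'href="([A-Z][^.]*\.html)"', re.IGNORECASE)

greedy_matches = greedy.findall(html)
no_period_matches = no_period.findall(html)

print(greedy_matches)     # one oversized match spanning both links
print(no_period_matches)  # ['Intro.html', 'Setup.html']
```

The trade-off is the one already mentioned: the conservative pattern will not match a page named "foo.bar.html".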
Alternatively, if Flex REs support lazy matching (which I *think* they do, although I'm not certain), you can just change the RE to
Meaning, page names start with [A-Z], followed by as few characters as possible: after each character it consumes, the lazy quantifier checks whether the trailing '.html">' can match, and it stops at the first place where it does.
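A rough sketch of the lazy version, again in Python's `re` with made-up sample markup. Whether Flex accepts the same `*?` syntax is an assumption here, so treat this as an illustration of the idea rather than a drop-in RE.

```python
import re

# Hypothetical input: the second page name contains an embedded period.
html = '<a href="Intro.html">Intro</a> <a href="release.notes.html">Notes</a>'

# Lazy: ".*?" grows one character at a time and stops as soon as the
# rest of the pattern (the literal .html followed by a quote) can match.
lazy = re.compile(r'href="([A-Z].*?\.html)"', re.IGNORECASE)

lazy_matches = lazy.findall(html)

print(lazy_matches)  # ['Intro.html', 'release.notes.html']
```

Unlike the "not a period" approach, the lazy pattern still accepts names with embedded periods, because it only stops where '.html' is immediately followed by the closing quote.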