After trying some GREP recipes, the results are poor: there are problems with numerals like c/d, and some words in other languages have a parallel meaning ("mil" is 1000 in Spanish). I suppose (?) that grepping the 21 Roman numerals that some books need for centuries could be enough.
Could this be handled as a GREP style, or only in the Find/Change window?
(Making queries is a problem, as each query is tied to a couple of specific styles and I do not know how to adapt them.)
How to grep: I, II, III, IV, etc?
"mil" is also a valid Roman number, so how would a GREP style see the difference? (One way could be the observation that your example list of Roman numbers only contain uppercase...)
I understand that your primary use would be something like "the XIX century" -- What would your proposed GREP style do? Make the Roman number small caps? In any case, the key word here is "century", so if you find a Roman number followed by that word, you can have your GREP style do something with it. All -- and I mean ALL -- other cases then must be handled manually.
As for the match ... hmmm ... writing out the twenty-one numbers is boring (and yet doable). Maybe something like
... I'm not sure I'm getting all of 1..21 that way, but it's a start.
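For anyone who wants to test the idea outside InDesign, here is a minimal sketch in Python's `re` module, not the expression posted above. It assumes uppercase numerals only, as in the example list, and the names `ROMAN_1_TO_21` and `find_romans` are made up for illustration. The pattern uses only `\b`, non-capturing groups and counted repeats, so it should carry over to a GREP style, but that part is untested here.

```python
import re

# Hypothetical name: matches uppercase Roman numerals I..XXI as whole words.
# XXI and XX are listed first; X?(...) covers 1-9 and 11-19; a bare X is 10.
ROMAN_1_TO_21 = re.compile(r"\b(?:XXI|XX|X?(?:IX|IV|V?I{1,3}|V)|X)\b")


def find_romans(text):
    """Return every standalone uppercase Roman numeral from I to XXI."""
    return [m.group(0) for m in ROMAN_1_TO_21.finditer(text)]


if __name__ == "__main__":
    print(find_romans("in the XIX and XX centuries, mil, XXII"))
```

Because the match is case-sensitive, lowercase "mil" is ignored, and XXII is skipped because it is outside the 1..21 range. The English pronoun "I" is still caught, which is the manual-checking problem mentioned above.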
Thank you for your time!
Yes, it is a very good beginning. I implore you to check the above, as the variations of your formula may be endless for a layperson.
Century/centuries is not always a reliable cue. An author could say: "in the XIX and XX bla bla". Roman numerals always arrive from Word as caps or small caps and are easily tagged.
Mil (M) is a nice exception, but it is not decisive. In my daily work, at least, Roman numerals usually involve centuries and run no higher than C (in straight social sciences bibliographies, vol. LXI is almost unseen). So a good limit should be C.
I believe that a very simple GREP (as the formula seems complicated) that lists all of I-XXI (or I-L, or I-C) is not a problem for the processor. I have been using GREP styles despite the idea proclaimed here that they bog down the machine.
Here is a list of Roman numerals, just for information.
i, ii, iii, iv, v, vi, vii, viii, ix, x, xi, xii, xiii, xiv, xv, xvi, xvii, xviii, xix, xx, xxi, xxii, xxiii, xxiv, xxv, xxvi, xxvii, xxviii, xxix, xxx, xxxi, xxxii, xxxiii, xxxiv, xxxv, xxxvi, xxxvii, xxxviii, xxxix, xl, xli, xlii, xliii, xliv, xlv, xlvi, xlvii, xlviii, xlix, l, li, lii, liii, liv, lv, lvi, lvii, lviii, lix, lx, lxi, lxii, lxiii, lxiv, lxv, lxvi, lxvii, lxviii, lxix, lxx, lxxi, lxxii, lxxiii, lxxiv, lxxv, lxxvi, lxxvii, lxxviii, lxxix, lxxx, lxxxi, lxxxii, lxxxiii, lxxxiv, lxxxv, lxxxvi, lxxxvii, lxxxviii, lxxxix, xc, xci, xcii, xciii, xciv, xcv, xcvi, xcvii, xcviii, xcix, c.
This kinda works - I'm sure Jongware can pick holes in it
The problem, though, is that it won't select the "c" at the end of the list.
If I remove the \b part it works just fine. Is there another way to encase an entire word?
It certainly won't work in every situation though. There are loads of instances where a "c" or "v" or even an "x" can be used in a standalone context, or together within a word.
Eugene, it seems a smart, neat solution.
Only the five main letters (I V X L C) are missed.
Of course it must be a minor problem for gifted people, but for me it is a big solution.
In my books this grep has to work in body, two types of quotations, bibliography, footnotes, each one with a different character style.
It is very easy to add the five missing letters and apply a color to them. Later, as the work progresses, they will show up and be noticed.
(If the problem is bigger, simply use Find/Change for all of them...)
Thank you. This approach worked fine as a GREP style...
I use something like Eugene's approach to *search* for Romans, but I dare not use it in a GREP style (as your typical English text is riddled with single I's, and I need to catch possible single capital letters as well). If Eugene's seems to work for you so far, you can try this one:
One of the drawbacks I've found with automating this is when marking page numbers. "p. iv" works, as does "p. xlvii" -- but with an added "c" for hundreds it also locates "p.c.", which is a common shorthand for "personal communication"...
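The "p.c." trap can be reproduced, and one possible way around it tried out, with a short Python sketch. This is not the GREP from the post above. It rests on an assumption of mine: a page numeral is never glued directly to a period, so "p. iv" has a space and "p.c." does not. The names `LOWER_ROMAN` and `find_page_romans` are hypothetical, and the pattern only covers numerals up to the hundreds (no d or m).

```python
import re

# Hypothetical sketch: standalone lowercase Roman numerals (i .. ccc range).
# (?<![\w.])  not preceded by a letter, digit or period (rejects the c in "p.c.")
# (?=[ivxlc]) must start with a numeral letter, so the match is never empty
# (?!\w)      must end at the end of the word
LOWER_ROMAN = re.compile(
    r"(?<![\w.])(?=[ivxlc])c{0,3}(?:xc|xl|l?x{0,3})(?:ix|iv|v?i{0,3})(?!\w)"
)


def find_page_romans(text):
    """Return lowercase Roman numerals that are not glued to a period."""
    return [m.group(0) for m in LOWER_ROMAN.finditer(text)]


if __name__ == "__main__":
    print(find_page_romans("see p. iv, p. xlvii and Smith (p.c.)"))
```

The price of this guard is that a reference typed without the space, such as "p.iv", is also skipped, so it only suits manuscripts where the spacing is consistent.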
The Midas Touch!
It is a superb grep.
Just a question: is the single I left out to keep the first-person pronoun in English safe?
In Spanish it is only a Roman numeral, and there is no harm in catching it.
Could it be integrated, for one hundred percent success here?
More interesting exceptions that turn up in the process: m (meters), cm (centimeters).
Perhaps, with another GREP, those (a dozen or so) could go into an exceptions list.
Thanks for this gifted piece of cake and talent.
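As a rough illustration of the m / cm exceptions, here is a Python sketch that skips a numeral when it follows a digit and a space, which is how units usually appear ("5 m", "12 cm"). That rule is my assumption, not something tested in a GREP style, and the names `ROMAN_NOT_UNIT` and `find_romans_skip_units` are hypothetical.

```python
import re

# Hypothetical sketch: lowercase Roman numerals including d and m, skipping
# anything that directly follows "<digit><space>", i.e. a likely unit.
# (?<!\w)     not in the middle of a word
# (?<!\d\s)   not preceded by a digit and one whitespace character
# (?=[ivxlcdm]) and (?!\w) keep the match non-empty and whole-word
ROMAN_NOT_UNIT = re.compile(
    r"(?<!\w)(?<!\d\s)(?=[ivxlcdm])"
    r"m{0,3}(?:cm|cd|d?c{0,3})(?:xc|xl|l?x{0,3})(?:ix|iv|v?i{0,3})(?!\w)"
)


def find_romans_skip_units(text):
    """Return lowercase Roman numerals that do not look like units."""
    return [m.group(0) for m in ROMAN_NOT_UNIT.finditer(text)]


if __name__ == "__main__":
    print(find_romans_skip_units("5 m wide, 12 cm deep, chapter xiv"))
```

Other abbreviations on the exceptions list would each need their own guard, and real words that happen to be valid numerals (such as "mix") would still be caught, so a manual check of the manuscript remains necessary.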
Ah -- not behind my computer (looking at an iPad here ;) ) but can you test if my GREP works if you add a single space before the i in your test text?
Actually, I think this should NOT make it work ... theoretically! The \b code is a "word break", and the start of a paragraph *ought* to match that.
Yes, you were right. The space catches the element starting a paragraph.
Now we may extend this solution to all the roman numerals starting a paragraph.
(something unusual but possible)
But this last one must be used with caution: it catches the first letter in «lettered» paragraphs such as c) and d).
Checking the manuscript will give the path to the best GREP.
It is very good: we have two GREPs with options for additional control.
It worked perfectly.