%{month}/%{day}
generates the regular expression
\([0-9]+\)/\([0-9]+\)
and the dictionary:
{'month': 1, 'day': 2}
The dictionary maps field names to numeric arguments for the Python regex
module's group() function.
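As a minimal sketch of how that dictionary is meant to be used: the snippet below is written against the modern re module rather than the regex module, so groups are spelled ( ) instead of \( \). The helper name extract is made up for illustration.

```python
import re

# Same pattern as above, in re-module syntax (an assumption: the post
# itself uses the Emacs-style regex module with \( \) groups).
pattern = re.compile(r'([0-9]+)/([0-9]+)')
fields = {'month': 1, 'day': 2}

def extract(line):
    """Return {field name: matched text}, or None if the line does not match."""
    m = pattern.match(line)
    if m is None:
        return None
    # Each dictionary value is passed straight to group().
    return {name: m.group(number) for name, number in fields.items()}

print(extract('1/25'))
```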
I have a somewhat more complicated specifier that works fine:
%{smonth}/%{sday}%{? - %{eday}}
The leading '?' says that particular chunk of the pattern is optional.
It generates
\([0-9]+\)/\([0-9]+\)\([ ]*-[ ]*\([0-9]+\)\)?
and
{'sday': 2, 'smonth': 1, 'eday': 4}
It successfully matches lines like
1/25-26
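The translation step is not shown here, so the following is only a guess at its rules, inferred from the two examples so far: %{name} becomes a numbered group matching digits, %{?...} becomes an optional group, a literal space becomes [ ]*, and any other character is copied through unchanged (assumed not to be a regex metacharacter). The name translate is hypothetical, and the output uses re-module parentheses rather than \( \).

```python
def translate(spec):
    """Sketch: turn a %{...} specifier into (pattern, {field: group number})."""
    fields = {}
    count = 0  # number of groups opened so far

    def parse(i, nested):
        nonlocal count
        out = []
        while i < len(spec):
            if spec.startswith('%{?', i):
                # Optional chunk: it is a group of its own, so it takes a number.
                count += 1
                body, i = parse(i + 3, True)
                out.append('(' + body + ')?')
            elif spec.startswith('%{', i):
                # Named field: remember which group number it received.
                count += 1
                end = spec.index('}', i)
                fields[spec[i + 2:end]] = count
                out.append('([0-9]+)')
                i = end + 1
            elif spec[i] == '}' and nested:
                # End of the enclosing optional chunk.
                return ''.join(out), i + 1
            elif spec[i] == ' ':
                out.append('[ ]*')
                i += 1
            else:
                out.append(spec[i])
                i += 1
        return ''.join(out), i

    pattern, _ = parse(0, False)
    return pattern, fields

print(translate('%{smonth}/%{sday}%{? - %{eday}}'))
```

Because groups are numbered in the order they open, the optional chunk takes number 3 and eday ends up as group 4, which agrees with the dictionary shown above.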
I also have a multi-day string that crosses the end of a month:
4/30-5/1
so I built the following pattern:
%{smonth}/%{sday}%{? - %{?%{emonth}/}%{eday}}
which generated
\([0-9]+\)/\([0-9]+\)\([ ]*-[ ]*\(\([0-9]+\)/\)?\([0-9]+\)\)?
and
{'smonth': 1, 'sday': 2, 'emonth': 5, 'eday': 6}
just as I expected. It works fine for date strings of the form 1/25 or
4/30-5/1, but returns incorrect results for dates of the form 1/25-26: the
return value of group(5) is '26' instead of None. This is especially
perplexing since group(4), which encloses group(5), correctly returns None.
(For those with acute regexp-itis: group(4) and group(6) are nested inside
group(3), and group(5) is nested inside group(4). Both group(3) and group(4)
are optional. I saw nothing in the Emacs regexp syntax info page that would
suggest optional regexps should not be nested within one another.)
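To pin down the behaviour I expected, here is the same nested pattern as a sketch in the modern re module (an assumption: ( ) groups instead of \( \), and a hypothetical helper named parse_range). In re, a group that takes no part in the final match reports None, so emonth comes back as None for 1/25-26. This only states the expectation; it does not say which of the possibilities listed below explains the regex module's result.

```python
import re

# Nested optional groups: group 3 wraps the whole "- ..." tail,
# group 4 wraps the optional "month/" part, group 5 is the month itself.
nested = re.compile(r'([0-9]+)/([0-9]+)([ ]*-[ ]*(([0-9]+)/)?([0-9]+))?')
fields = {'smonth': 1, 'sday': 2, 'emonth': 5, 'eday': 6}

def parse_range(line):
    """Return {field name: text or None}, or None if the line does not match."""
    m = nested.match(line)
    if m is None:
        return None
    return {name: m.group(number) for name, number in fields.items()}

for line in ('1/25', '4/30-5/1', '1/25-26'):
    print(line, parse_range(line))
```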
I noticed that the version of Tatu Ylonen's regexpr.c code used in Python
did not seem to be the most recent, so I fetched the version that was posted
to comp.sources.misc (in volume 27) and the one patch for it I found (in
volume 29), merged Guido's changes into them, and rebuilt Python (1.1.1), but
saw no improvement.
Can anybody steer me in the right direction? Have I
a. overstepped the bounds of regular expressions (nesting multiple
optional regexps, perhaps)?
b. failed in my understanding of how they work?
c. generated a faulty regular expression?
d. found a bug in regexpr.c?
e. some, all or none of the above? :-)
Thanks,
--
Skip Montanaro    skip@automatrix.com    (518)372-5583
Automatrix - World-Wide Computing Solutions    http://www.automatrix.com/