Re: Sugar for regular expression groupings.

Guido.van.Rossum@cwi.nl
Tue, 23 Feb 1993 02:03:23 +0100

>
>| Improvement 2:
>| Add syntax to regular expressions so that groups can be named
>| in place, yielding the group dictionary. (This is a *big*
>| advantage over perl.)
>
>I strongly second this improvement. How often do I just guess which
>subexpression yields what result? Often, since I always have trouble
>counting nested subexpressions (and I'm sure I'm not the only one).

I'm afraid the trouble with this one is that the syntax of Python
regular expressions is defined by the GNU Emacs regular expression
package. I am using a Finnish reimplementation that is free of the
GNU copyleft, but which follows the GNU syntax and interface quite
precisely, so it is possible to plug in the GNU Emacs code instead
(which was slightly faster and a lot smaller, if I remember correctly).

I can't say I understand this code, and I would rather not modify it.
So what can I do? Is it really that hard to count occurrences of \(?
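
If counting by hand is the sticking point, the counting can be done
outside the regular expression package, without touching its code. What
follows is only a rough sketch under stated assumptions, not part of any
existing module: the names group_offsets and name_groups are
hypothetical, it assumes Emacs-style patterns where \( opens a group,
and it does not special-case bracket expressions such as [\(].

```python
def group_offsets(pattern):
    """Return the offset of every group-opening backslash-paren, in group order."""
    offsets = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            if pattern[i + 1] == "(":
                offsets.append(i)
            # Skip the escaped character too, so that an escaped
            # backslash followed by a plain ( is not counted as a group.
            i += 2
        else:
            i += 1
    return offsets


def name_groups(pattern, names):
    """Map hypothetical names to group numbers 1..n, in order of opening."""
    count = len(group_offsets(pattern))
    if count != len(names):
        raise ValueError("pattern has %d groups, got %d names"
                         % (count, len(names)))
    return {name: index for index, name in enumerate(names, start=1)}


# Example: three groups, the outer one opening first.
pattern = r"\(\([a-z]+\)@\([a-z.]+\)\)"
numbers = name_groups(pattern, ["address", "user", "host"])
```

With this, numbers["host"] is 3, so the caller can ask for a group by
name and leave the counting to the helper.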

>| But what python really needs are LALR(1) parser objects, don't you
>| think?
>
>Yes, that would solve a lot of other problems I encountered while
>parsing *long* strings.

That's an interesting thought. Unfortunately the only LALR(1) parser
generators I know of are Yacc and Bison, and neither appears to be
easily modified to generate its tables in a form other than C data
structures (anyone here to contradict this statement?). Would you
folks settle for a recursive descent parser generator (like the one
used to build the Python parser)? That one I know how to hack...
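
For anyone who hasn't seen the technique, here is a minimal
hand-written recursive descent parser, with one method per grammar
rule. It is an illustration only: it is not the output of the parser
generator mentioned above, and the toy arithmetic grammar, the Parser
class and the evaluate function are all made up for this example.

```python
import re

# One token is an integer or a single operator/parenthesis character.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("bad input at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr: term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term: factor (('*' | '/') factor)*   ('/' is integer division)
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value //= self.factor()
        return value

    def factor(self):
        # factor: NUMBER | '(' expr ')'
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError("expected a number, got %r" % (token,))
        return int(token)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected %r" % (parser.peek(),))
    return value
```

Calling evaluate("2 + 3 * (4 - 1)") gives 11. The appeal of the
approach is that each rule of the grammar becomes one ordinary function,
so the parser is easy to read and to hack.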

'Night,

--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>