Pyli/Regular Expressions
Regular Expressions
Many languages support a regular expression syntax similar to Perl's. Perl's RE syntax, however, stopped being simple long ago: it now allows arbitrary expressions to be embedded in a regular expression.
I plan to abandon the special syntax altogether and allow only expressions.
Regular expressions are used for one or more of the following:
- Identify whether a string matches an expression.
- Identify what matches the expression, including sub-expressions.
- Split a string based on an expression.
- Provide a lexer.
The lexer is discussed more fully as part of Parsing. It is simply an iterator that returns successive matches (like the second use case above).
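For orientation, here are the four use cases in a conventional regex library. This uses Python's standard `re` module as an analogue only; it is not Pyli, and the pattern and sample strings are made up for illustration.

```python
import re

text = "x = 42"
pattern = re.compile(r"(\w+)\s*=\s*(\d+)")

# 1. Does the string match the expression?
m = pattern.fullmatch(text)
matched = m is not None

# 2. What matched, including sub-expressions (groups)?
whole, groups = m.group(0), m.groups()

# 3. Split a string based on an expression.
parts = re.split(r"\s*,\s*", "a, b ,c")

# 4. A lexer: an iterator that returns successive matches.
tokens = [t.group(0) for t in re.finditer(r"\w+|=", text)]
```

The rest of this page replaces the pattern strings above with ordinary expressions that build matchers.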
Whether a string matches an expression
This is the simplest case of all.
There are several functions that build matching functions. A matching function simply returns True or False given a particular string.
Individual Characters
These look only at the first character of the string and ignore the rest.
- (matches-exact <char>) : Matches an exact char
- (matches-one-of <string>) : Matches one of the characters in the string.
- (matches-any-but <string>) : Matches any character but one in the string.
- (matches-class <description>) : Matches any character in the class named by the description (space, non-space, letter, etc.)
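A minimal sketch of these four builders, written in Python because Pyli is not yet runnable. The Python names mirror the Pyli ones. The class table is an assumption: the list above only names "space, non-space, letter, etc.". The sketch also assumes that an empty string never matches.

```python
# Hypothetical Python analogues of the Pyli builders above.
# Each builder returns a function that looks only at the first character
# of its argument and returns True or False.

def matches_exact(char):
    return lambda s: s[:1] == char

def matches_one_of(chars):
    # The emptiness check matters: "" is "in" every Python string.
    return lambda s: s[:1] != "" and s[:1] in chars

def matches_any_but(chars):
    return lambda s: s[:1] != "" and s[:1] not in chars

# Assumed class names and tests, for illustration only.
CLASSES = {
    "space": str.isspace,
    "non-space": lambda c: not c.isspace(),
    "letter": str.isalpha,
}

def matches_class(description):
    test = CLASSES[description]
    return lambda s: s[:1] != "" and test(s[:1])
```

For example, `matches_one_of("abc")("cat")` is True, while `matches_any_but("abc")("cat")` is False.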
Repeaters
These apply one of the matchers above repeatedly, consuming successive parts of the string.
- (matches-zero-or-once <matcher>) : Similar to '?'
- (matches-once-plus <matcher>) : Similar to '+'
- (matches-zero-plus <matcher>) : Similar to '*'
- (matches-range <matcher> <from> <to>) : Similar to '{from,to}'
The minimal (non-greedy) variants attempt to match as few times as possible.
- (matches-zero-or-once-minimal <matcher>) : Similar to '??'
- (matches-once-plus-minimal <matcher>) : Similar to '+?'
- (matches-zero-plus-minimal <matcher>) : Similar to '*?'
- (matches-range-minimal <matcher> <from> <to>) : Similar to {from,to}?
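A sketch of how the greedy and minimal repeaters could differ, again in Python. It makes an assumption that goes beyond the text: a repeater has to report how far it got, so here a matcher takes a string and a position and yields candidate end positions in order of preference, rather than returning True or False. Greedy repeaters offer the longest match first and minimal ones offer the shortest first. The `minimal` flag and the use of `None` for "no upper bound" are conventions of this sketch only.

```python
def matches_exact(char):
    """Yield pos + 1 when s[pos] is `char`; yield nothing otherwise."""
    def matcher(s, pos=0):
        if s[pos:pos + 1] == char:
            yield pos + 1
    return matcher

def matches_range(inner, lo, hi, minimal=False):
    """Repeat `inner` between lo and hi times; hi=None means unbounded."""
    def matcher(s, pos=0, count=0):
        can_stop = count >= lo
        can_continue = hi is None or count < hi
        if minimal and can_stop:
            yield pos          # minimal: offer the shorter match first
        if can_continue:
            for end in inner(s, pos):
                if end > pos:  # skip zero-width steps so we cannot loop forever
                    yield from matcher(s, end, count + 1)
        if not minimal and can_stop:
            yield pos          # greedy: offer the shorter match last
    return matcher

def matches_range_minimal(inner, lo, hi):
    return matches_range(inner, lo, hi, minimal=True)

def matches_zero_or_once(inner):
    return matches_range(inner, 0, 1)

def matches_once_plus(inner):
    return matches_range(inner, 1, None)

def matches_zero_plus(inner):
    return matches_range(inner, 0, None)

def matches_zero_or_once_minimal(inner):
    return matches_range_minimal(inner, 0, 1)

def matches_once_plus_minimal(inner):
    return matches_range_minimal(inner, 1, None)

def matches_zero_plus_minimal(inner):
    return matches_range_minimal(inner, 0, None)

a = matches_exact("a")
greedy = list(matches_zero_plus(a)("aaab"))        # [3, 2, 1, 0]
lazy = list(matches_zero_plus_minimal(a)("aaab"))  # [0, 1, 2, 3]
```

Both variants accept the same set of end positions; only the order differs, and that order decides which match wins once something else has to match after the repeater. A True/False answer can be recovered by checking whether the generator yields anything at all.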
Alternates
These match if any one of the given expressions matches.
- (matches-alternate <matcher> ...)
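A minimal Python sketch of alternation, using the True/False matching-function convention described earlier. The two sample matchers are hypothetical stand-ins for matchers built elsewhere, and trying the alternatives left to right is an assumption of this sketch.

```python
def matches_alternate(*matchers):
    """Return a matcher that succeeds if any of the given matchers succeeds."""
    # any() stops at the first matcher that returns True.
    return lambda s: any(m(s) for m in matchers)

# Hypothetical sample matchers for illustration.
starts_with_a = lambda s: s[:1] == "a"
starts_with_digit = lambda s: s[:1].isdigit()

a_or_digit = matches_alternate(starts_with_a, starts_with_digit)
```

Here `a_or_digit("apple")` and `a_or_digit("7up")` are both True, and `a_or_digit("pear")` is False. With no matchers at all, the result never matches.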