Regular Expression
There are two sets of regular expression rules:
- Basic Regular Expression (BRE)
- Extended Regular Expression (ERE)
| | BRE | ERE |
|---|---|---|
| Escape the next character | `\` | `\` |
| Match any single character except newline | `.` | `.` |
| Bracket expression | `[]` | `[]` |
| Grouping | `\(\)` | `()` |
| Alternation | `\|` (GNU extension, not part of POSIX BRE) | `|` |
| Match 0 or more times | `*` | `*` |
| Match 1 or more times | `\{1,\}` | `+` |
| Match 1 or 0 times | `\{0,1\}` | `?` |
| Match exactly m times | `\{m\}` | `{m}` |
| Match at least m but no more than n times | `\{m,n\}` | `{m,n}` |
| Match the beginning of the string (when not in `[]`) | `^` | `^` |
| Match the end of the string (when not in `[]`) | `$` | `$` |
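
As a rough illustration of the ERE column, the sketch below uses Python's `re` module, whose syntax follows the ERE style for these operators. Python's engine is not a POSIX implementation, so treat this as an analogy rather than a conformance check. The helper name `same_matches` and the sample strings are made up for the example.

```python
import re

def same_matches(pattern_a, pattern_b, samples):
    """Return True if both patterns fully match exactly the same samples."""
    return all(
        bool(re.fullmatch(pattern_a, s)) == bool(re.fullmatch(pattern_b, s))
        for s in samples
    )

samples = ["", "a", "aa", "aaa", "b", "ab"]

# `+` is shorthand for `{1,}` and `?` is shorthand for `{0,1}`.
plus_equiv = same_matches(r"a+", r"a{1,}", samples)
optional_equiv = same_matches(r"a?", r"a{0,1}", samples)

# Grouping and alternation need no backslashes in ERE-style syntax.
grouped = re.fullmatch(r"(cat|dog)s?", "dogs")
animal = grouped.group(1)

# Bounded repetition: at least 2 but no more than 3 times.
two_to_three = [s for s in samples if re.fullmatch(r"a{2,3}", s)]

# Anchors: `^` ties the match to the start, `$` to the end.
starts = bool(re.search(r"^ab", "abc"))
ends = bool(re.search(r"ab$", "abc"))

print(plus_equiv, optional_equiv, animal, two_to_three, starts, ends)
```
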
Most modern applications implement ERE-style syntax unless stated otherwise. A notable exception is the Linux command `grep`, which uses BRE by default and switches to ERE with `grep -E`.
Many applications also implement the following features on top of the above rules:
- Greedy vs non-greedy match (Python)
- Special sequences such as `\d` and `\w`
- Flags/modifiers to change the matching behavior (e.g. ASCII-only matching or ignore case)
- Non-capturing group, named group (Python)
- Lookahead assertion, negative lookahead assertion (Python)
- Lookbehind assertion, negative lookbehind assertion (Python)
- Conditional pattern (`(?(condition)yes-pattern|no-pattern)`)
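
The extensions listed above can be sketched with Python's `re` module. This is a minimal tour, not a reference: every input string and variable name is invented for illustration. In Python, the condition of a conditional pattern is a group number or name, so `(?(1)...)` below tests whether group 1 took part in the match.

```python
import re

# Greedy `.*` takes as much as possible; non-greedy `.*?` as little as possible.
html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)
non_greedy = re.findall(r"<b>.*?</b>", html)

# Special sequences: `\d` matches a digit, `\w` matches a word character.
digits = re.findall(r"\d+", "order 66, room 101")

# Flags: ignore case, and restrict `\w` to ASCII characters.
any_case = re.findall(r"python", "Python PYTHON python", re.IGNORECASE)
unicode_words = re.findall(r"\w+", "caf\u00e9")
ascii_words = re.findall(r"\w+", "caf\u00e9", re.ASCII)

# Non-capturing group `(?:...)` and named group `(?P<name>...)`.
title = re.fullmatch(r"(?:Mr|Ms)\. (?P<surname>\w+)", "Ms. Lovelace")
surname = title.group("surname")
captured = title.groups()  # only the named group is captured

# Lookahead `(?=...)` and negative lookahead `(?!...)`.
py_stems = re.findall(r"\w+(?=\.py)", "a.py b.txt c.py")
not_percent = re.findall(r"\b\d+\b(?!%)", "50% 20 7%")

# Lookbehind `(?<=...)` and negative lookbehind `(?<!...)`.
prices = re.findall(r"(?<=\$)\d+", "cost $30, qty 4")
not_prices = re.findall(r"(?<!\$)\b\d+", "cost $30, qty 4")

# Conditional pattern: require a closing `>` only if an opening `<` matched.
email = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")
bracketed = email.match("<user@host.com>") is not None
bare = email.match("user@host.com") is not None
unbalanced = email.match("<user@host.com") is not None

print(greedy, non_greedy, py_stems, prices, bracketed, bare, unbalanced)
```
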