Regular Expression
POSIX defines two sets of regular expression rules:
- Basic Regular Expression (BRE)
- Extended Regular Expression (ERE)
| | BRE | ERE |
|---|---|---|
| Escape the next character | `\` | `\` |
| Match any single character except newline | `.` | `.` |
| Bracket expression | `[]` | `[]` |
| Grouping | `\(\)` | `()` |
| Alternation | `\\|` (GNU extension; not in POSIX BRE) | `\|` |
| Match 0 or more times | `*` | `*` |
| Match 1 or more times | `\{1,\}` | `+` |
| Match 0 or 1 times | `\{0,1\}` | `?` |
| Match exactly m times | `\{m\}` | `{m}` |
| Match at least m but no more than n times | `\{m,n\}` | `{m,n}` |
| Match the beginning of the string (when not in `[]`) | `^` | `^` |
| Match the end of the string (when not in `[]`) | `$` | `$` |
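
The ERE column can be tried out in Python, whose `re` module uses ERE-like syntax for these operators. The snippet below is only an illustrative sketch: the sample strings and variable names are made up, and Python's `re` is not a POSIX implementation, so it cannot demonstrate the BRE column.

```python
import re

# Grouping and alternation: (cat|dog) matches either word.
animals = re.findall(r"(cat|dog)s", "cats and dogs")

# `+` is shorthand for {1,} and `?` is shorthand for {0,1}.
plus_short = re.fullmatch(r"ab+c", "abbbc") is not None
plus_long = re.fullmatch(r"ab{1,}c", "abbbc") is not None
opt_short = re.fullmatch(r"colou?r", "color") is not None
opt_long = re.fullmatch(r"colou{0,1}r", "colour") is not None

# Bounded repetition: between 2 and 3 digits.
samples = ("7", "42", "123", "1234")
bounded = [bool(re.fullmatch(r"[0-9]{2,3}", s)) for s in samples]

# Anchors: ^ and $ pin the pattern to the start and end of the string,
# so "end" alone matches but "the end" does not.
anchored_miss = re.search(r"^end$", "the end") is None

print(animals, plus_short, plus_long, opt_short, opt_long, bounded, anchored_miss)
```

The short and long quantifier forms accept the same strings, which is why the BRE interval forms in the table can stand in for `+` and `?`.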
Most modern applications implement ERE by default. Some tools, such as the Linux command `grep`, default to BRE and use ERE only when explicitly asked (for example, `grep -E`).
Many applications also implement the following features on top of the above rules:
- Greedy vs non-greedy match (Python)
- Special sequences such as `\d` and `\w`
- Flags/modifiers to change the matching behavior (e.g. ASCII-only matching or ignore case)
- Non-capturing group, named group (Python)
- Lookahead assertion, negative lookahead assertion (Python)
- Lookbehind assertion, negative lookbehind assertion (Python)
- Conditional pattern (`(?(condition)yes-pattern|no-pattern)`)
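
The features listed above can be sketched with Python's `re` module. This is a minimal illustration rather than a full reference: all sample strings and variable names are invented for the example, and the conditional pattern follows the form shown in the Python `re` documentation, where the condition is a group number or name.

```python
import re

# Greedy vs non-greedy: `.*` takes as much as possible, `.*?` as little.
html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# Special sequences: \d matches a digit, \w a word character.
digits = re.findall(r"\d+", "room 42, floor 7")
words = re.findall(r"\w+", "a_b c-d")

# Flags: ignore case, and restrict \w to ASCII characters only.
any_case = re.findall(r"python", "Python PYTHON python", flags=re.IGNORECASE)
ascii_words = re.findall(r"\w+", "na\u00efve", flags=re.ASCII)

# Named groups (?P<name>...) and a non-capturing group (?:...).
m = re.match(r"(?P<year>\d{4})-(?:\d{2})-(?P<day>\d{2})", "2024-05-17")
date_parts = m.groupdict()

# Lookahead (?=...) and negative lookahead (?!...).
before_px = re.findall(r"\d+(?=px)", "10px 20em 30px")
not_foobar = re.findall(r"foo(?!bar)\w*", "foobar foobaz")

# Lookbehind (?<=...) and negative lookbehind (?<!...).
after_dollar = re.findall(r"(?<=\$)\d+", "cost $15, qty 3")
no_dollar = re.findall(r"(?<!\$)\b\d+", "cost $15, qty 3")

# Conditional pattern: require ">" only if the optional "<" (group 1) matched.
email = r"(<)?\w+@\w+(?:\.\w+)+(?(1)>|$)"
candidates = ("<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>")
balanced = [bool(re.match(email, s)) for s in candidates]

print(greedy, lazy, digits, words, any_case, ascii_words, date_parts)
print(before_px, not_foobar, after_dollar, no_dollar, balanced)
```

Note that the non-capturing group does not appear in `date_parts`, and that the lookaround assertions check the surrounding text without including it in the match.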