Regular Expressions in Import Rules
Regular expressions used in import rules support the following standard syntax:
.          any non-newline character
^          start of line
$          end of line
\b         word boundary
\B         non-word boundary
\A         start of subject (usually the same as ^)
\z         end of subject (usually the same as $)
\d         decimal digit
\D         non-decimal digit
\s         whitespace
\S         non-whitespace
\w         word character
\W         non-word character
(a|z)      a or z
[az]       a or z
[^az]      not a or z
[a-z]      a through z
(foo)      capture foo
a?         0 or 1 a's
a*         0 or more a's
a+         1 or more a's
a{3}       exactly 3 a's
a{3,}      3 or more a's
a{3,5}     between 3 and 5 a's (inclusive)
All regular expressions are case-sensitive and Unicode-aware. For example, \s matches Unicode whitespace characters as well as ASCII ones.
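As a rough illustration of a few of these constructs, here is a minimal sketch using Python's `re` module as a stand-in. This is an assumption made for demonstration only: Scanner's own engine may differ in details, and the paths below are hypothetical. The sketch sticks to constructs that Python's `re` handles the same way as described above.

```python
import re

# Hypothetical folder-style paths, invented for illustration only.
paths = [
    "logs/2024/app.json",
    "logs/24/app.json",
    "Logs/2024/app.json",
]

# \A anchors at the start of the subject; \d{4} requires exactly four digits.
year_folder = re.compile(r"\Alogs/\d{4}/")

# Matching is case-sensitive, so "Logs/..." is rejected along with "logs/24/...".
matches = [p for p in paths if year_folder.search(p)]
print(matches)

# \s is Unicode-aware: it matches a no-break space (U+00A0) as well as " ".
unicode_space = re.search(r"a\sb", "a\u00a0b") is not None
ascii_space = re.search(r"a\sb", "a b") is not None
print(unicode_space, ascii_space)
```

Only the first path satisfies the pattern: the second has too few digits and the third differs in case.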
Limitations
Certain regular expression features are not supported in import rules. Specifically:
Lookarounds (i.e. lookahead and lookbehind), both negative and positive.
Positive lookaround can usually be matched directly instead. E.g. foo(?=bar) could just be matched as foobar.
Negative lookaround can usually be matched as a normal regex, but it can be tricky.
E.g. pre_(?!no/)[^/]*/ (any pre_ folder except pre_no) can be matched as pre_([^/]?|[^n/][^/]|[^/][^o/]|[^/]{3,})/.
Because complex regexes like this are hard to maintain, we recommend just positive-matching the specific known items instead, e.g. pre_(yes|yeah|sure).
Backreferences.
Because backreferences are non-regular, it is generally not possible to replicate the same match without them.
When possible, we recommend enumerating all the items instead. For example, instead of trying to match all folders with foob(a+)rb\1z/, you can enumerate the folders you know exist, like foo(barbaz|baarbaaz)/.
These features are unsupported because they allow the construction of extremely slow (exponential-time) regexes that are hard for Scanner to detect.
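The rewrites suggested above can be sanity-checked offline before a rule is saved. The following is a minimal sketch, again using Python's `re` module as a stand-in (it does support lookarounds and backreferences, so it can serve as a reference). It assumes the lookahead is meant to reject exactly the folder `pre_no/`, written here as `pre_(?!no/)[^/]*/`, and that the backreference refers to the first capture group, written as `\1`. The sample folder names and the `accepts` helper are hypothetical.

```python
import re

# Reference patterns using the unsupported features (fine in Python's `re`,
# used here only to check the rewrites). Assumption: the lookahead is meant
# to reject exactly the folder "pre_no/".
with_lookahead = re.compile(r"pre_(?!no/)[^/]*/")
with_backref = re.compile(r"foob(a+)rb\1z/")

# Rewrites that avoid lookarounds and backreferences.
without_lookahead = re.compile(r"pre_([^/]?|[^n/][^/]|[^/][^o/]|[^/]{3,})/")
known_folders = ["foobarbaz", "foobaarbaaz"]  # hypothetical list of real folders
enumerated = re.compile("(" + "|".join(map(re.escape, known_folders)) + ")/")


def accepts(pattern, text):
    """Return True if the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None


# Hypothetical folder names covering suffix lengths 0, 1, 2 and 3+.
prefix_samples = ["pre_/", "pre_n/", "pre_no/", "pre_on/", "pre_nope/", "pre_yes/"]
lookahead_agrees = all(
    accepts(with_lookahead, s) == accepts(without_lookahead, s)
    for s in prefix_samples
)

# The enumeration only covers folders that were listed explicitly.
folder_samples = ["foobarbaz/", "foobaarbaaz/", "foobaarbaz/", "foobaaarbaaaz/"]
backref_hits = [s for s in folder_samples if accepts(with_backref, s)]
enumerated_hits = [s for s in folder_samples if accepts(enumerated, s)]

print(lookahead_agrees)
print(backref_hits)
print(enumerated_hits)
```

On these samples the lookaround-free rewrite agrees with the lookahead form. The enumerated pattern matches only the two listed folders, so a folder such as `foobaaarbaaaz/` would need to be added to the list explicitly if it ever appears.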