Regular Expressions in Import Rules
Regular expressions used in import rules support the following standard syntax:
Pattern    Meaning
.          any non-newline character
^          start of line
$          end of line
\b         word boundary
\B         non-word boundary
\A         start of subject (usually the same as ^)
\z         end of subject (usually the same as $)
\d         decimal digit
\D         any character that is not a decimal digit
\s         whitespace
\S         non-whitespace
\w         word character
\W         non-word character
(a|z)      a or z
[az]       a or z
[^az]      not a or z
[a-z]      a through z
(foo)      capture foo
a?         0 or 1 occurrences of a
a*         0 or more occurrences of a
a+         1 or more occurrences of a
a{3}       exactly 3 occurrences of a
a{3,}      3 or more occurrences of a
a{3,5}     between 3 and 5 occurrences of a (inclusive)
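As a rough illustration of how several of these elements combine, here is a minimal sketch using Python's re module as a stand-in. Python's engine is not the one Scanner uses, and the object keys below are invented for the example; only syntax from the table above appears in the pattern.

```python
import re

# Hypothetical object keys, invented purely for illustration.
keys = [
    "logs/2024/app-01.json",
    "logs/2024/app-123456.json",
    "logs/2024/db-01.json",
    "tmp/app-01.json",
]

# ^ and $ anchor the match to the whole key, \d{4} requires exactly four
# digits, \d{2,5} allows two to five digits, and (app|db) is an
# alternation that also captures the service name.
pattern = re.compile(r"^logs/\d{4}/(app|db)-\d{2,5}\.json$")

matched = [k for k in keys if pattern.search(k)]
services = [pattern.search(k).group(1) for k in matched]

print(matched)   # keys accepted by the pattern
print(services)  # captured service names, in the same order
```

The key with a six-digit suffix is rejected by \d{2,5}, and the key under tmp/ is rejected by the ^ anchor.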
All regular expressions are case-sensitive and Unicode-aware; for example, \s will match Unicode whitespace characters as well as ASCII ones.
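The sketch below illustrates both properties, again using Python's re module as a stand-in for Scanner's engine (an assumption made for illustration only). Python's str patterns are also case-sensitive and Unicode-aware by default, and its re.ASCII flag is used here only to show what ASCII-only matching would look like by contrast.

```python
import re

# Case sensitivity: a lowercase pattern does not match uppercase text.
lower_matches_upper = re.search(r"error", "ERROR: disk full") is not None

# One portable way to accept both cases is to spell them out with
# character classes, which uses only syntax from the table above.
either_case = re.search(r"[Ee][Rr][Rr][Oo][Rr]", "ERROR: disk full") is not None

# Unicode awareness: \s matches an ASCII space, a no-break space (U+00A0),
# and an em space (U+2003).
samples = ["a b", "a\u00a0b", "a\u2003b"]
unicode_ws = [re.search(r"a\sb", s) is not None for s in samples]

# For contrast only: with ASCII-only semantics, just the plain space matches.
ascii_ws = [re.search(r"a\sb", s, flags=re.ASCII) is not None for s in samples]

print(lower_matches_upper, either_case)
print(unicode_ws)
print(ascii_ws)
```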
Certain features of regular expressions aren't supported when they're used in import rules. These are, specifically:
Lookarounds (i.e. lookahead and lookbehind), both negative and positive.
Positive lookaround can usually be matched directly instead. E.g. foo(?=bar) could just be matched as foobar.
Negative lookaround can usually be rewritten as a normal regex, but it can be tricky. E.g. pre_(?!no/)[^/]*/ can be matched as pre_([^/]?|[^n/][^/]|[^/][^o/]|[^/]{3,})/.
Because complex regexes like this are hard to maintain, we recommend positively matching the specific known items instead, e.g. pre_(yes|yeah|sure).
Backreferences.
Because backreferences are non-regular, it isn't generally possible to replicate the same match without them.
When possible, we recommend enumerating all the items instead. E.g. instead of trying to match all folders with foob(a+)rb\1z/, you can just enumerate the folders you know exist, like foo(barbaz|baarbaaz)/.
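The rewrites above can be sanity-checked outside Scanner. The following is a minimal sketch in Python, whose re module does support lookarounds and backreferences, so the original and rewritten forms can be compared side by side. It rests on two assumptions: the negative lookahead is taken to be pre_(?!no/)[^/]*/ (any folder name after pre_ except exactly no), and the backreference is written as \1. The folder names are invented for the test.

```python
import re

def agree(pattern_a, pattern_b, samples):
    """Return True if both patterns accept exactly the same samples."""
    a, b = re.compile(pattern_a), re.compile(pattern_b)
    return all(bool(a.search(s)) == bool(b.search(s)) for s in samples)

# Positive lookahead versus matching the text directly.
positive_ok = agree(r"foo(?=bar)", r"foobar", ["foobar", "foobaz", "foo"])

# Negative lookahead versus the lookaround-free rewrite.
folders = ["pre_/", "pre_n/", "pre_no/", "pre_on/", "pre_yes/",
           "pre_nope/", "pre_not/x", "other/"]
negative_ok = agree(
    r"pre_(?!no/)[^/]*/",
    r"pre_([^/]?|[^n/][^/]|[^/][^o/]|[^/]{3,})/",
    folders,
)

# Backreference versus an explicit enumeration. The enumeration covers
# only the folders that were listed, so the two are NOT equivalent in
# general: a third folder that the backreference would accept is missed.
with_backref = re.compile(r"foob(a+)rb\1z/")
enumerated = re.compile(r"foo(barbaz|baarbaaz)/")
candidates = ["foobarbaz/", "foobaarbaaz/", "foobaaarbaaaz/", "foobarbaaz/"]
backref_hits = [bool(with_backref.search(c)) for c in candidates]
enumerated_hits = [bool(enumerated.search(c)) for c in candidates]

print(positive_ok, negative_ok)
print(backref_hits)
print(enumerated_hits)
```

The last two lists differ on the third candidate, which is why enumeration only works when the full set of folders is known in advance.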
These features are unsupported due to allowing for construction of that are hard for Scanner to detect.