Regular Expressions in Import Rules

Regular expressions used in import rules support the following standard syntax:

Pattern   Matches
.         any non-newline character
^         start of line
$         end of line
\b        word boundary
\B        non-word boundary
\A        start of subject (usually the same as ^)
\z        end of subject (usually the same as $)
\d        decimal digit
\D        any character other than a decimal digit
\s        whitespace
\S        non-whitespace
\w        word character
\W        non-word character
(a|z)     a or z
[az]      a or z
[^az]     any character other than a or z
[a-z]     a through z
(foo)     capture foo
a?        0 or 1 occurrences of a
a*        0 or more occurrences of a
a+        1 or more occurrences of a
a{3}      exactly 3 occurrences of a
a{3,}     3 or more occurrences of a
a{3,5}    between 3 and 5 occurrences of a (inclusive)

All regular expressions are case-sensitive and Unicode-aware. For example, \s matches Unicode whitespace characters as well as ASCII ones.
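As a rough illustration of these two properties, the sketch below uses Python's built-in re module, whose string patterns are also case-sensitive and Unicode-aware by default. Python's engine is only a stand-in here and is not what Scanner runs, and the sample strings are made up for the example.

```python
import re

# Hypothetical sample values; Python's `re` is a stand-in, not Scanner's engine.
NBSP = "\u00a0"      # NO-BREAK SPACE, a non-ASCII whitespace character
EM_SPACE = "\u2003"  # EM SPACE, another non-ASCII whitespace character


def matches(pattern: str, text: str) -> bool:
    """Return True if `pattern` matches anywhere in `text`."""
    return re.search(pattern, text) is not None


# Case sensitivity: the pattern only matches text with the same letter case.
case_results = {
    "lower_vs_lower": matches(r"error", "an error occurred"),
    "lower_vs_upper": matches(r"error", "AN ERROR OCCURRED"),
}

# Unicode awareness: \s matches ASCII and non-ASCII whitespace alike.
whitespace_results = {
    "ascii_space": matches(r"a\sb", "a b"),
    "nbsp": matches(r"a\sb", f"a{NBSP}b"),
    "em_space": matches(r"a\sb", f"a{EM_SPACE}b"),
}

if __name__ == "__main__":
    print(case_results)
    print(whitespace_results)
```

If log paths may differ in letter case, one option is to spell out both cases with a character class, such as [Ee]rror.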

Limitations

Certain features of regular expressions aren't supported when they're used in Import Rules. These are, specifically:

  • Lookarounds (i.e. lookahead and lookbehind), both negative and positive.

    • Positive lookaround can usually be matched directly instead. E.g. foo(?=bar) could just be matched as foobar.

    • Negative lookaround can usually be matched as a normal regex, but it can be tricky.

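      • E.g. pre_(?!no/)[^/]*/ (any pre_ folder except pre_no/) can be matched as pre_([^/]?|[^n/][^/]|[^/][^o/]|[^/]{3,})/, which accepts a name of length 0 or 1, a two-character name other than no, or a name of three or more characters.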

        • Because complex regexes like this are hard to maintain, we recommend just positive-matching the specific known items instead, e.g. pre_(yes|yeah|sure).

  • Backreferences.

    • Due to the nature of backreferences (i.e. that they are non-regular), it isn't generally possible to replicate the same match without them.

      • When possible, we recommend just enumerating all the items in this case instead, e.g. instead of trying to match all folders with foob(a+)rb\1z/, you can just enumerate the folders you know exist, like foo(barbaz|baarbaaz)/.
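The lookaround rewrites described above can be sanity-checked outside Scanner before they go into an import rule. The sketch below is a minimal check in Python, whose re module does support lookarounds, so the original and rewritten patterns can be compared side by side. Python's engine is only a stand-in for Scanner's, the sample keys are hypothetical, and the lookahead pattern pre_(?!no/)[^/]*/ is an assumed spelling of "any pre_ folder except pre_no/".

```python
import re

# Lookahead versions: these work in Python but are NOT supported in
# Scanner import rules. They serve only as a reference for comparison.
NEGATIVE_LOOKAHEAD = re.compile(r"pre_(?!no/)[^/]*/")  # assumed intent
POSITIVE_LOOKAHEAD = re.compile(r"foo(?=bar)")

# Lookaround-free rewrites taken from the text above.
NEGATIVE_REWRITE = re.compile(r"pre_([^/]?|[^n/][^/]|[^/][^o/]|[^/]{3,})/")
POSITIVE_REWRITE = re.compile(r"foobar")

# Hypothetical keys used only for this comparison.
NEGATIVE_SAMPLES = [
    "pre_/", "pre_n/", "pre_no/", "pre_on/", "pre_ok/",
    "pre_nope/", "pre_yes/", "logs/pre_no/x", "pre_no",
]
POSITIVE_SAMPLES = ["foobar", "foobaz", "xfoobarx", "foo"]


def disagreeing_keys(reference, rewrite, keys):
    """Return the keys on which the two patterns give different answers."""
    return [
        key for key in keys
        if bool(reference.search(key)) != bool(rewrite.search(key))
    ]


negative_disagreements = disagreeing_keys(
    NEGATIVE_LOOKAHEAD, NEGATIVE_REWRITE, NEGATIVE_SAMPLES
)
positive_disagreements = disagreeing_keys(
    POSITIVE_LOOKAHEAD, POSITIVE_REWRITE, POSITIVE_SAMPLES
)
accepted = [key for key in NEGATIVE_SAMPLES if NEGATIVE_REWRITE.search(key)]

if __name__ == "__main__":
    print("negative disagreements:", negative_disagreements)
    print("positive disagreements:", positive_disagreements)
    print("accepted by rewrite:", accepted)
```

An empty disagreement list on a representative set of keys is a reasonable signal that the rewrite preserves the intended filter, although it is not a proof. The comparison only checks whether a key matches; the matched span can differ, because a lookahead does not consume the text it inspects.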

These features are unsupported because they make it possible to construct extremely slow (exponential-time) regexes that are hard for Scanner to detect.
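For the backreference case, enumerating known folders trades generality for a pattern that avoids the unsupported feature. The sketch below compares the two approaches using Python's re module, which does support backreferences; it is a stand-in for illustration and not Scanner's engine. The folder names are hypothetical, and the backreference is written as \1 (the first capture group), which is an assumption about the intended pattern.

```python
import re

# Backreference version: works in Python, NOT supported in Scanner import rules.
# \1 must repeat exactly the run of "a" characters captured by (a+).
BACKREF = re.compile(r"foob(a+)rb\1z/")

# Enumerated version: lists only the folders that are known to exist.
ENUMERATED = re.compile(r"foo(barbaz|baarbaaz)/")

# Hypothetical folder names used only for this comparison.
FOLDERS = ["foobarbaz/", "foobaarbaaz/", "foobarbaaz/", "foobaaarbaaaz/"]

# Map each folder to (matches backreference pattern, matches enumerated pattern).
comparison = {
    folder: (bool(BACKREF.search(folder)), bool(ENUMERATED.search(folder)))
    for folder in FOLDERS
}

if __name__ == "__main__":
    for folder, (backref_hit, enumerated_hit) in comparison.items():
        print(f"{folder:16} backref={backref_hit} enumerated={enumerated_hit}")
```

The last sample folder matches the backreference pattern but not the enumerated one, which shows the cost of enumeration: a newly created folder has to be added to the alternation before it is picked up.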


Last updated 11 months ago
