Query Syntax

This page describes the syntax for searching through your log events.

Log event structure

In Scanner, a log event is a collection of key-value pairs called fields. In a field, the key is always a string, and the value may be either a string or a number.

For example, a log event from application logs might look like this:

{
  "message": "INFO - Successfully added item. item_id=817343 shopping_cart_id=1842101",
  "elapsed_ms": 79,
  "status_code": 200,
  "kubernetes": {
    "container_name": "shopping_cart_api",
    "pod_name": "app-3"
  },
  "@scnr": {
    "context_fields": "container_name,pod_name"
  }
}

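The resulting Scanner log event flattens nested objects into dot-separated keys and extracts key=value pairs found in message under message.%kv, so it would look like this: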

message: "INFO - Successfully added item. item_id=817343 shopping_cart_id=1842101"
message.%kv.item_id: 817343
message.%kv.shopping_cart_id: 1842101
elapsed_ms: 79
status_code: 200
kubernetes.container_name: "shopping_cart_api"
kubernetes.pod_name: "app-3"
@scnr.context_fields: "container_name,pod_name"
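The mapping from the raw JSON to the flat field list can be sketched in Python. This is only an illustration of the idea, not Scanner's implementation: the helper names `flatten`, `to_number`, and `to_fields`, the `key=value` regular expression, and the choice to run extraction on every string field are all assumptions made for this example.

```python
import re

# Assumed pattern for key=value pairs embedded in a string value.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")

def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into dotted / indexed field names."""
    out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            out.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            out.update(flatten(value, f"{prefix}[{index}]"))
    else:
        out[prefix] = obj
    return out

def to_number(text):
    """Return an int or float when the text parses as one, else the text."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text

def to_fields(event):
    """Flatten an event, then add <field>.%kv.<key> entries (hypothetical rule)."""
    fields = flatten(event)
    for name, value in list(fields.items()):
        if isinstance(value, str):
            for key, raw in KV_PATTERN.findall(value):
                fields[f"{name}.%kv.{key}"] = to_number(raw)
    return fields

event = {
    "message": "INFO - Successfully added item. item_id=817343 shopping_cart_id=1842101",
    "elapsed_ms": 79,
    "status_code": 200,
    "kubernetes": {"container_name": "shopping_cart_api", "pod_name": "app-3"},
    "@scnr": {"context_fields": "container_name,pod_name"},
}
fields = to_fields(event)
```

With the sample event above, `fields` holds the same eight keys as the listing, including `message.%kv.item_id` and `kubernetes.pod_name`.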

Text queries

Type in free-form text to search for hits. By default, search is case-insensitive for ASCII characters, so these two queries match the same lines.

info successfully added
INFO Successfully added

By default, tokens are matched separately, so these match the same lines.

info successfully added
info added successfully
added and info and successfully
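As a rough mental model, the two defaults above (ASCII case-insensitivity and order-independent token matching) behave like the Python sketch below. The tokenizer here (runs of word characters) and the helper names `ascii_lower` and `matches` are assumptions for illustration; Scanner's actual tokenization rules may differ.

```python
import re
import string

# Translation table that lowercases ASCII letters only, leaving other characters alone.
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def ascii_lower(text):
    return text.translate(ASCII_LOWER)

def matches(query, line):
    """True when every query term appears as a full token in the line."""
    # Assumption: a token is a run of word characters; "and" is treated as the operator.
    terms = [t for t in re.findall(r"\w+", ascii_lower(query)) if t != "and"]
    line_tokens = set(re.findall(r"\w+", ascii_lower(line)))
    return all(term in line_tokens for term in terms)

line = "INFO - Successfully added item. item_id=817343 shopping_cart_id=1842101"
same_hits = [
    matches("info successfully added", line),
    matches("info added successfully", line),
    matches("added and info and successfully", line),
]
```

All three entries of `same_hits` are `True`, while a partial token such as `success` does not match in this sketch because only full tokens are compared.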

Search terms only match full tokens by default (see Token Boundaries). Wildcards (see below) can be used for subtoken matches.

Bare (unquoted) strings cannot include whitespace or any of the following characters: :()"'<>=|,~{}!#`. They also can't be any reserved keywords (see Reserved Keywords).

Use single-quotes ' if you need to match any of these characters (or if you need to match a reserved keyword).

'info - item not added'
'info - successfully added item and committed transaction'

Use double-quotes " for exact, case-sensitive matching.

"item_id=817343"
"INFO - Successfully added item"

Quoted strings support escape sequences. See Escape Sequences for Strings for a comprehensive list.

Use * for wildcard searches. You can use \* to match the actual asterisk character instead.

app-*
*@protonmail.com
'andrew j*son'
"This sentence contains an actual asterisk: \*"

Column Queries

Use column: value to search for a column that contains value.

message: info added
message: 'info - successfully added item'
message: "INFO - Successfully added item"
kubernetes.pod_name: app-*
email: *@protonmail.com
current_president: 'andrew j*son'

As in simple text queries, the : operator only matches full tokens by default; see Token Boundaries. Wildcards * can be used for subtoken matching.

Use column = value to search for a column that is exactly value.

name = al
# matches: {name: "Al"}, {name: "al"}
# but NOT: {name: "Big Al"}

name = "Al"
# matches: {name: "Al"}
# but NOT: {name: "al"}, {name: "Big Al"}

email = "*@protonmail.co"
# matches: {email: "al@protonmail.co"}, {email: "rob@protonmail.co"}
# but NOT: {email: "jon@protonmail.com"}
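The behavior of the = operator in the examples above can be approximated with a small Python helper. This is a sketch under assumptions, not Scanner's matching code: the name `equals` and its `case_sensitive` flag (standing in for a double-quoted value) are hypothetical, and escaped asterisks are not handled.

```python
import re

def equals(value, term, case_sensitive=False):
    """Whole-value comparison where * in the term matches any run of characters."""
    pattern = ".*".join(re.escape(piece) for piece in term.split("*"))
    flags = re.DOTALL
    if not case_sensitive:
        # re.ASCII limits case folding to ASCII letters.
        flags |= re.IGNORECASE | re.ASCII
    return re.fullmatch(pattern, value, flags) is not None

# name = al  (bare value, case-insensitive)
bare = [equals(v, "al") for v in ("Al", "al", "Big Al")]

# name = "Al"  (double-quoted value, case-sensitive)
quoted = [equals(v, "Al", case_sensitive=True) for v in ("Al", "al", "Big Al")]
```

Here `bare` is `[True, True, False]` and `quoted` is `[True, False, False]`, mirroring the comments in the examples.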

Use column: * or column = * if you just want to check if a column exists at all.

As with string values, bare (unquoted) column names cannot include whitespace or any of the following characters: :()"'<>=|,~{}!`. They also can't be any reserved keywords (see Reserved Keywords).

Use backticks `` to denote columns that contain spaces or other disallowed characters.

`cat breed` = "Domestic shorthair"

Quoted column names support escape sequences. See Escape Sequences for Strings for a comprehensive list.

Use * or ** as a wildcard in column names. * matches any character other than .[]; ** matches any character. You can use \* to match the actual asterisk character instead.

*name = "Jackson"
# matches: {fname: "Andrew", lname: "Jackson"}, {fname: "Jackson", lname: "Pollock"}
# but NOT: {name_first: "Janet", name_last: "Jackson"}

request.*.status = 500
# matches: {request.first_part.status: 200, request.second_part.status: 500}
# but NOT: {request.first_part.connection.status: 500}

request.**.status = 500
# matches: {request.first_part.status: 200, request.second_part.status: 500}
# matches: {request.first_part.connection.status: 500}

pet_kinds[*]: "fish"
# matches: {pet_kinds[0]: "cat", pet_kinds[1]: "dog", pet_kinds[2]: "fish"}
# but NOT: {pet_kinds[0].preferred_foods[0]: "fish"}

pet_kinds[**]: "fish"
# matches: {pet_kinds[0]: "cat", pet_kinds[1]: "dog", pet_kinds[2]: "fish"}
# matches: {pet_kinds[0].preferred_foods[0]: "fish"}
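The difference between * and ** in column names can be illustrated by translating a column pattern into a regular expression. The sketch below is an assumption-laden illustration rather than Scanner's implementation: `column_regex` and `any_column_equals` are hypothetical helpers, and the escaped form \* is not handled.

```python
import re

def column_regex(pattern):
    """Translate a column-name pattern: ** matches anything, * stops at . [ ]"""
    parts = re.split(r"(\*\*|\*)", pattern)
    translated = []
    for part in parts:
        if part == "**":
            translated.append(".*")
        elif part == "*":
            translated.append(r"[^.\[\]]*")
        else:
            translated.append(re.escape(part))
    return re.compile("".join(translated))

def any_column_equals(fields, pattern, value):
    """True when some column whose name fits the pattern holds exactly value."""
    regex = column_regex(pattern)
    return any(regex.fullmatch(name) and v == value for name, v in fields.items())

shallow = {"request.first_part.status": 200, "request.second_part.status": 500}
deep = {"request.first_part.connection.status": 500}

single_star = (any_column_equals(shallow, "request.*.status", 500),
               any_column_equals(deep, "request.*.status", 500))
double_star = (any_column_equals(shallow, "request.**.status", 500),
               any_column_equals(deep, "request.**.status", 500))
```

In this sketch `single_star` is `(True, False)` and `double_star` is `(True, True)`, which lines up with the `request.*.status` and `request.**.status` examples.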

Number queries

If your log events have number fields, you can look for exact matches or inequalities.

elapsed_ms: 79
elapsed_ms = 79
elapsed_ms <= 100
elapsed_ms > 100

Boolean queries

Scanner supports boolean queries using and, or, and not. These are case-insensitive.

kubernetes.container_name: "shopping_cart_api" 
and elapsed_ms > 100 and elapsed_ms < 10000 
and not status_code >= 400

You can use parentheses to specify order of operations.

(message.%kv.item_id: 817343 or message.%kv.item_id: 25134) 
and elapsed_ms > 50

If parentheses aren't used, then not has the highest precedence, then and, then or, so these two queries are identical.

elapsed_ms > 10 and not status_code >= 400 or message.%kv.item_id: 817343

(elapsed_ms > 10 and (not status_code >= 400)) or message.%kv.item_id: 817343
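Python's own boolean operators follow the same ordering (not, then and, then or), so the equivalence can be checked by brute force. The sketch below models the three conditions as plain booleans; the function names are hypothetical and it says nothing about how Scanner evaluates queries internally.

```python
from itertools import product

def implicit(slow, is_error, item_hit):
    # Mirrors: elapsed_ms > 10 and not status_code >= 400 or item_id: 817343
    return slow and not is_error or item_hit

def explicit(slow, is_error, item_hit):
    # Mirrors the fully parenthesized form of the same query.
    return (slow and (not is_error)) or item_hit

def other_grouping(slow, is_error, item_hit):
    # A different grouping, shown only to demonstrate that parentheses matter.
    return slow and not (is_error or item_hit)

combos = list(product([False, True], repeat=3))
identical = all(implicit(*c) == explicit(*c) for c in combos)
differs = any(implicit(*c) != other_grouping(*c) for c in combos)
```

`identical` is `True` across all eight combinations, while `differs` is `True` because regrouping with parentheses changes the result for some inputs.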

If omitted, the default operator is and; i.e. any two query terms without a boolean operator will be assumed to be using and, so the following two queries are identical.

kubernetes.container_name: "shopping_cart_api" and elapsed_ms > 100

kubernetes.container_name: "shopping_cart_api" elapsed_ms > 100

Boolean operators can be used inside of column filters for the : and = operators, in which case the column filter distributes. Hence, these queries are identical.

stdout: ("hello" and 'world')

stdout: "hello" and stdout: 'world'

Inside of a column filter, the default operator is or rather than and, so the following queries are identical.

message.%kv.item_id = (817343 or 25134 or 55535)

message.%kv.item_id = (817343 25134 55535)

message.%kv.item_id = 817343
or message.%kv.item_id = 25134
or message.%kv.item_id = 55535
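The way a column filter distributes over a parenthesized group can be sketched as a small query-rewriting function. This is an illustration only: the `distribute` helper and its tuple representation of a group, such as `("or", [...])`, are assumptions made for this example and are not part of Scanner.

```python
def distribute(column, op, expr):
    """Expand column OP (group) into one column OP value term per value."""
    if isinstance(expr, tuple):
        joiner, operands = expr
        parts = []
        for operand in operands:
            text = distribute(column, op, operand)
            # Keep nested groups parenthesized so the meaning is preserved.
            parts.append(f"({text})" if isinstance(operand, tuple) else text)
        return f" {joiner} ".join(parts)
    separator = ": " if op == ":" else f" {op} "
    return f"{column}{separator}{expr}"

both_words = distribute("stdout", ":", ("and", ['"hello"', "'world'"]))
any_item = distribute("message.%kv.item_id", "=", ("or", [817343, 25134, 55535]))
```

Here `both_words` expands to `stdout: "hello" and stdout: 'world'`, and `any_item` expands to the three-way or query shown above on a single line.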

Additional Details

Token Boundaries

A query match will always start and stop on a whole token, and will never start or stop in the middle of one. A token corresponds roughly to a word.

A wildcard * can span token boundaries.

  • al will match "Al Sharpton", but not "Walt Whitman", "Alan Turing", or "John Calvin".

  • al*n will match "Alan Turing" and "Albert Einstein", but not "Walt Whitman" or "John Calvin".

If you need full subtoken matching, a wildcard can be placed at the beginning or end of a match term.

  • al* will match "Al Sharpton" and "Alan Turing", but not "Walt Whitman" or "John Calvin".

  • *al*n* will match all of "Alan Turing", "Albert Einstein", "Walt Whitman", and "John Calvin".

Note that search terms with prefix wildcards may be slower to run.

Escape Sequences for Strings

You can use escape sequences for certain characters. These work in all strings, including column name strings.

Escape sequence    Character
\"                 double quote "
\'                 single quote '
\`                 backtick `
\*                 asterisk *
\\                 backslash \
\/                 forward slash /
\b                 backspace U+0008
\f                 form feed U+000C
\n                 line feed U+000A
\r                 carriage return U+000D
\t                 horizontal tab U+0009
\uXXXX             unicode character U+XXXX
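To make the escape sequences concrete, here is a minimal Python sketch that decodes them into the characters they stand for. It is an illustration under assumptions, not Scanner's parser: the `unescape` helper is hypothetical, and rejecting unknown sequences with an error is a choice made for this example.

```python
import re

# Single-character escapes and the characters they represent.
SIMPLE_ESCAPES = {
    '"': '"', "'": "'", "`": "`", "*": "*", "\\": "\\", "/": "/",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}

# A backslash followed by either uXXXX (four hex digits) or any single character.
ESCAPE_RE = re.compile(r"\\(u[0-9a-fA-F]{4}|.)", re.DOTALL)

def unescape(text):
    """Replace each escape sequence with the character it denotes."""
    def replace(match):
        body = match.group(1)
        if len(body) == 5:                      # the uXXXX form
            return chr(int(body[1:], 16))
        if body in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[body]
        raise ValueError("unknown escape sequence: \\" + body)
    return ESCAPE_RE.sub(replace, text)

decoded = unescape(r"This sentence contains an actual asterisk: \*")
```

After decoding, `decoded` ends with a plain `*`. Note that a real query parser would also need to remember that this asterisk was escaped, so that it is not treated as a wildcard; this sketch does not track that.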

Reserved Keywords

The following keywords are reserved in filters: and, or, not, let. Use quotes if you need to search for them as strings, and backticks if you need to use them as column names.
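When building queries programmatically, the bare-string rules and the reserved keywords can be combined into a quoting check. The Python sketch below is a hypothetical helper, not a Scanner API: it assumes keywords are reserved regardless of case, uses the disallowed-character list given for bare strings in Text queries, and treats backslashes in the input as literal characters.

```python
# Characters that a bare (unquoted) string may not contain, per Text queries.
DISALLOWED = set(":()\"'<>=|,~{}!#`")
RESERVED = {"and", "or", "not", "let"}

def needs_quotes(term):
    """True when a search term cannot be written as a bare string."""
    return (
        term == ""
        or term.lower() in RESERVED   # assumption: keywords are case-insensitive
        or any(ch.isspace() or ch in DISALLOWED for ch in term)
    )

def render_term(term):
    """Return the term as-is when bare is allowed, else single-quoted and escaped."""
    if not needs_quotes(term):
        return term
    escaped = term.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"

rendered = [render_term(t) for t in ("app-3", "not", "info - item not added")]
```

Single quotes keep the match case-insensitive; swapping in double quotes would make it exact and case-sensitive, as described in Text queries.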


Last updated 14 days ago