About Scanner

Fast ad-hoc search and threat detections for security logs. Ingest terabytes per day - without breaking the bank.


Scanner is for builders

Unleash your high volume log sources

Many security and observability teams store high-volume log sources in S3 to keep log management costs low. However, once the logs are in S3, it can be difficult to get value out of them without significant data engineering projects, such as maintaining ETL pipelines to transform semi-structured logs into Parquet files, reshaping data to conform to SQL table schemas, and maintaining indexes and partitions in tools like Amazon Athena. Even after all of this work, most tools that can search data in S3 take minutes or hours to run a single query. As a result, valuable data in these log files becomes effectively inaccessible.

Scanner fixes these problems by indexing logs in place in your S3 buckets and by giving you a lightning-fast search experience. You can build on top of Scanner's API for ad hoc search, time series querying, and threat detections, and you can jump into Scanner's search UI for rapid investigations.

Build a modern security and observability stack - without blind spots

By using the API that Scanner provides on top of your logs in S3, you can build a modern security and observability stack at a fraction of the cost of other tools. For example, you can use Cribl or Vector to write logs and traces into S3; use Scanner to power log search, time series, and threat detections on top of that data in S3; build dashboards in Grafana or Tableau powered by the Scanner API; and send threat detection events from Scanner to Slack, Tines, Torq, Jira, and custom webhooks.

Fast search on large data sets

When you execute a query, Scanner launches serverless Lambda functions to traverse its index files at high speed. Using data structures like string token posting lists and numerical ranges, the index files guide Scanner to the log regions that contain hits. A needle-in-a-haystack search (e.g., for an IP address, email address, or UUID) across 100TB of logs takes around 10 seconds; across 1PB of logs, around 100 seconds. Scanner queries can be 10-100x faster than other tools that scan S3, like Trino, Amazon Athena, or CloudWatch.
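To make the idea of posting lists and numerical ranges concrete, here is a minimal Python sketch. It is not Scanner's implementation or index format: the `REGIONS` data, the tokenizer, and every function name are hypothetical and chosen only for illustration. The point it shows is that an index can narrow a search to a few candidate regions so that most of the data is never read.

```python
import re
from collections import defaultdict

# Hypothetical toy data: each "region" stands in for a chunk of log lines in S3.
REGIONS = {
    0: ["GET /login src_ip=10.0.0.1 status=200",
        "GET /home src_ip=10.0.0.2 status=200"],
    1: ["POST /login src_ip=123.45.67.89 status=401",
        "GET /admin src_ip=10.0.0.3 status=500"],
    2: ["GET /health src_ip=10.0.0.4 status=204"],
}

TOKEN_RE = re.compile(r"[A-Za-z0-9_.]+")   # assumed tokenizer, for illustration
STATUS_RE = re.compile(r"status=(\d+)")


def tokenize(line):
    """Split a log line into lowercase tokens."""
    return TOKEN_RE.findall(line.lower())


def build_index(regions):
    """Build token posting lists and per-region (min, max) status ranges."""
    postings = defaultdict(set)   # token -> set of region ids containing it
    status_ranges = {}            # region id -> (min status, max status)
    for region_id, lines in regions.items():
        statuses = []
        for line in lines:
            for token in tokenize(line):
                postings[token].add(region_id)
            match = STATUS_RE.search(line)
            if match:
                statuses.append(int(match.group(1)))
        if statuses:
            status_ranges[region_id] = (min(statuses), max(statuses))
    return postings, status_ranges


def search_token(token, regions, postings):
    """Scan only the regions whose posting list mentions the token."""
    token = token.lower()
    candidates = sorted(postings.get(token, ()))
    hits = [line for rid in candidates for line in regions[rid]
            if token in tokenize(line)]
    return candidates, hits


def regions_with_status_at_least(threshold, status_ranges):
    """Use the numeric ranges to skip regions that cannot contain a match."""
    return sorted(rid for rid, (_, high) in status_ranges.items()
                  if high >= threshold)


postings, status_ranges = build_index(REGIONS)
candidates, hits = search_token("123.45.67.89", REGIONS, postings)
error_regions = regions_with_status_at_least(500, status_ranges)
```

In this toy data set, the token lookup and the range check each narrow the search to a single region, so the other two regions are never scanned.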

Eliminate data engineering work

Scanner is designed to be highly flexible. It indexes S3 log files in place, in their original semi-structured format: JSON, CSV, plaintext, or Parquet. This means you can eliminate many kinds of data engineering projects, like maintaining a log processing pipeline to transform logs to adhere to strict SQL table schemas. Scanner automatically parses your logs, and it also extracts data from any JSON strings or key-value pair strings (e.g., src_ip=123.45.67.89) that it encounters in your data. All fields are indexed, so you can search on any field.
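As a rough illustration of extracting data from embedded JSON strings and key-value pair strings, the sketch below flattens a log event into searchable fields. It is an assumption-laden toy, not Scanner's parser: the regular expression, the dotted field-path naming (for example `message.src_ip`), and the function names are all hypothetical.

```python
import json
import re

# Assumed key=value pattern: a word, "=", then a quoted string or a run of non-spaces.
KV_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def extract_fields(text):
    """Return fields found inside a string: a JSON object or key=value pairs."""
    text = text.strip()
    if text.startswith("{"):
        try:
            parsed = json.loads(text)
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass  # not valid JSON; fall through to key=value extraction
    return {key: value.strip('"') for key, value in KV_RE.findall(text)}


def flatten(event, prefix=""):
    """Flatten nested objects and embedded strings into dotted field paths."""
    flat = {}
    for key, value in event.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        elif isinstance(value, str):
            nested = extract_fields(value)
            if nested:
                flat.update(flatten(nested, path + "."))
            flat[path] = value  # keep the raw string as a field too
        else:
            flat[path] = value
    return flat


event = {
    "level": "warn",
    "message": "blocked src_ip=123.45.67.89 action=deny",
    "detail": '{"user": {"name": "alice"}}',
}
fields = flatten(event)
```

Here `fields` contains the original top-level values plus the extracted ones, such as `message.src_ip`, `message.action`, and `detail.user.name`, without any schema having been declared up front.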

Easy onboarding, zero-cost data transfer

When you sign up, we will launch an instance of Scanner in a new, unique AWS account in your region. Then, you simply use CloudFormation, Terraform, or Pulumi to set up a few things in your AWS account:

  1. An IAM role and policy

  2. A new S3 bucket to store Scanner's index files

  3. An SNS topic for S3 bucket event notifications

Since the Scanner instance uses a VPC endpoint to interact with your S3 buckets in the same region, data transfer cost is zero. This can be much cheaper than shipping logs over the internet to a third-party vendor.

Work with a trustworthy partner

Scanner maintains all of its data in S3 buckets in your AWS account, giving you complete control of all of your log data. Scanner has completed SOC 2 Type I and Type II audits.

How to get started

Onboard with Scanner's engineering team

To get started, sign up for a demo at https://scanner.dev. You'll meet with our engineers, who will learn about your use cases and walk you through the setup process:

  • Scanner will deploy a new Scanner instance to your AWS region.

  • You will run a CloudFormation, Terraform, or Pulumi template to create:

    • An IAM role that Scanner can assume

    • A new S3 bucket for Scanner index files

    • An SNS topic for S3 bucket notifications, which will relay events to Scanner's SQS queue

  • The Scanner engineering team will send you email invitations to log in to your Scanner instance, and they will meet with you to walk you through the product.

Start querying

Scanner will rapidly index your historical log files as well as new log files written to your S3 bucket(s). Log in to https://app.scanner.dev and start running queries. To learn more about using Scanner, view the Query Syntax docs.

Set up detection rules

Configure detection rules to look for log events matching particular criteria over a time period. When the criteria are met, Scanner can send notifications to Slack, Tines, Torq, Jira, or custom webhooks. For more information, view the Detection Rules docs.
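The detection rule concept described above (events matching criteria over a time period, followed by a notification) can be sketched as a simple threshold check. This is a generic Python illustration, not Scanner's rule format or API: `ThresholdRule`, `evaluate`, the event fields, and the alert payload are all hypothetical. Actually delivering the alert to a webhook or chat tool is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class ThresholdRule:
    """Hypothetical rule: alert when enough matching events occur in a window."""
    name: str
    matches: Callable[[dict], bool]   # criteria applied to each event
    window: timedelta                 # how far back to look
    threshold: int                    # minimum matching events to alert


def evaluate(rule: ThresholdRule, events: list, now: datetime) -> Optional[dict]:
    """Return an alert payload if the rule's criteria are met, else None."""
    window_start = now - rule.window
    count = sum(
        1 for event in events
        if window_start <= event["timestamp"] <= now and rule.matches(event)
    )
    if count >= rule.threshold:
        return {
            "rule": rule.name,
            "count": count,
            "window_start": window_start.isoformat(),
        }
    return None


now = datetime(2024, 1, 1, 12, 0, 0)

# Made-up sample events: five recent failures, one success, one old failure.
events = [
    {"timestamp": now - timedelta(minutes=m), "outcome": "failure"}
    for m in (1, 3, 5, 7, 9)
]
events.append({"timestamp": now - timedelta(minutes=2), "outcome": "success"})
events.append({"timestamp": now - timedelta(minutes=60), "outcome": "failure"})

rule = ThresholdRule(
    name="Repeated failed logins",
    matches=lambda event: event["outcome"] == "failure",
    window=timedelta(minutes=15),
    threshold=5,
)
alert = evaluate(rule, events, now)
```

With these sample events, the five failures inside the 15-minute window meet the threshold, while the success and the hour-old failure are ignored. A real deployment would hand the returned payload to whatever notification channel is configured.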
