About Scanner

Find threats that are hiding in your security data lake in S3.

Need help getting set up?

  • For every customer, we create a private Slack channel where we invite you to ask us questions anytime and send us product feedback.
  • To get started, you can email our founders at [email protected], or sign up on the website.

Why use Scanner?

  • Find threats that are hiding in your security data lake. Many log tools retain only a few weeks of logs, providing little visibility into historical data. Scanner analyzes the data in your security data lake in S3 and stores skip-list index files in your S3 bucket. You can run full-text queries with statistical aggregations and set up detection rules (coming soon) on your data lake.
  • Fast search for petabyte-scale log data sets in S3. When you execute a query, Scanner launches serverless Lambda functions to traverse its skip-list index files at high speed. Searching for a needle in a haystack (e.g. an IP address, email address, or UUID) across one petabyte of logs takes tens of seconds, not tens of hours. Even if your query returns many results, Scanner stays fast: it can scan through index files at speeds of up to one terabyte per second. Creating the index files is also fast: Scanner can index up to 1 PB of logs per day. The index files created by Scanner compress well and are usually ~20x smaller than the original data.
  • Analyze logs in any format - no schema required. Scanner can analyze S3 log files stored in JSON, Parquet, CSV, or plaintext format. Scanner is schemaless, so there is no need to create or maintain a schema to search through your data. Scanner automatically parses your logs and also extracts data from any JSON strings or key-value pair strings (e.g. strings containing src_ip=) that it encounters in your data. Every field it encounters is indexed and fast to search.
  • Easy onboarding, zero-cost data transfer. When you sign up, we launch an instance of Scanner in a brand new AWS account in your region. You then use CloudFormation, Terraform, or Pulumi to give the Scanner instance permission to read the S3 bucket(s) you want to index. Because the Scanner instance uses a VPC endpoint to interact with your S3 buckets in the same region, data transfer cost is zero, and there is no need to ship logs over the public internet.
  • Work with a trustworthy partner. Scanner maintains all of its data in S3 buckets in your AWS account, allowing you to control all of your log data. Scanner has completed SOC 2 Type I and Type II audits.
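
To make the idea of schemaless extraction more concrete, here is a minimal Python sketch of how fields might be pulled out of JSON, nested JSON strings, and key-value pair strings without a predefined schema. It is an illustration only and not Scanner's implementation: the function names, the key=value pattern, the dotted field names, the "message" field for raw plaintext lines, and the sample log values are all assumptions made for this example.

```python
import json
import re

# Hypothetical pattern for key=value pairs, e.g. src_ip=10.0.0.1 or user="bob smith"
KV_PATTERN = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def _flatten(value, prefix):
    """Recursively flatten a parsed value into {dotted_field_name: value}."""
    if isinstance(value, dict):
        fields = {}
        for key, child in value.items():
            name = f"{prefix}.{key}" if prefix else str(key)
            fields.update(_flatten(child, name))
        return fields
    if isinstance(value, str):
        # A string may itself contain a JSON object; if so, descend into it.
        try:
            parsed = json.loads(value)
        except ValueError:
            parsed = None
        if isinstance(parsed, dict):
            return _flatten(parsed, prefix)
        # Otherwise keep the raw string and also pull out any key=value pairs.
        fields = {prefix or "message": value}
        for match in KV_PATTERN.finditer(value):
            key = match.group(1)
            raw = match.group(3) if match.group(3) is not None else match.group(2)
            fields[f"{prefix}.{key}" if prefix else key] = raw
        return fields
    # Numbers, booleans, nulls, and lists are kept as leaf values.
    return {prefix: value}


def extract_fields(line):
    """Turn one log line (JSON or plaintext) into a flat dict of fields."""
    return _flatten(line.strip(), "")


# Sample data (made up for this example): a JSON event with a nested JSON string
# and a message containing key=value pairs.
event = {
    "user": "alice",
    "detail": json.dumps({"action": "login"}),
    "msg": "denied src_ip=10.0.0.1 port=22",
}
fields = extract_fields(json.dumps(event))

# Sample plaintext line with key=value pairs.
plain = extract_fields('Jan 1 12:00:00 sshd: Failed password src_ip=10.0.0.1 user="bob smith"')
```

In this sketch every discovered field, such as `detail.action` or `msg.src_ip`, ends up as its own entry, which is the kind of flat field set an indexer could then make searchable.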

How to get started

  • Choose an AWS region. Let us know which AWS region your S3 buckets are in. We will deploy an instance of Scanner to a unique AWS account for your team in that region. Contact us at [email protected] to get started, or sign up on the website. You can learn more about the architecture by reading the Scanner Architecture docs.
  • Run our CloudFormation, Terraform, or Pulumi template. To integrate with your Scanner instance, run our CloudFormation template to give the instance permission to read your S3 buckets. It will also create a new S3 bucket where Scanner's skip-list index files will be stored and an SQS queue to receive notifications whenever a new object is written to your buckets. Terraform and Pulumi templates are also available. You can learn more by reading the S3 Integration docs.
  • In Scanner, choose the S3 bucket(s) and keys to index. Log in to Scanner and select the S3 bucket(s) you want to index, along with the key prefix paths (and optional regex patterns) to filter the log files you want Scanner to index. Scanner supports JSON, Parquet, CSV, and plaintext log files. You can learn more by reading the Selecting Files to Index docs.
  • Start querying. Scanner will rapidly index your historical log files as well as brand new log files written to your S3 bucket(s). Log in to Scanner and start running investigation queries right away. To learn more about using Scanner, view the Query Syntax docs.
  • Set up detection rules (Coming soon). Configure detection rules to look for log events matching particular criteria over a time period. If a threshold is exceeded, Scanner can send notifications to Slack, PagerDuty, and/or custom webhooks.
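
The combination of new-object notifications and key prefix filtering described in the steps above can be sketched roughly as follows. This Python example is an illustration under stated assumptions, not Scanner's actual logic: it assumes the SQS message body is a standard S3 event notification delivered directly from S3 (not wrapped by SNS), and the function names, bucket name, keys, and the choice to apply the regex to the full object key are all hypothetical.

```python
import json
import re
from urllib.parse import unquote_plus


def new_object_keys(sqs_body):
    """Yield (bucket, key) for each object-created record in an S3 event notification."""
    event = json.loads(sqs_body)
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # ignore deletions and other event types
        s3 = record["s3"]
        # S3 URL-encodes object keys in event notifications (spaces become "+").
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def should_index(key, prefix, pattern=None):
    """Return True if key starts with prefix and, if given, matches the regex.

    Assumption for this sketch: the regex is searched against the full key.
    """
    if not key.startswith(prefix):
        return False
    return pattern is None or re.search(pattern, key) is not None


# Made-up notification with three records: two new objects and one deletion.
body = json.dumps({
    "Records": [
        {"eventName": "ObjectCreated:Put",
         "s3": {"bucket": {"name": "my-logs"},
                "object": {"key": "cloudtrail/2024/01/01/events+1.json.gz"}}},
        {"eventName": "ObjectCreated:Put",
         "s3": {"bucket": {"name": "my-logs"},
                "object": {"key": "cloudtrail/2024/01/01/digest.txt"}}},
        {"eventName": "ObjectRemoved:Delete",
         "s3": {"bucket": {"name": "my-logs"},
                "object": {"key": "cloudtrail/2023/12/31/old.json.gz"}}},
    ]
})

# Keep only new objects under the prefix whose key ends in .json or .json.gz.
selected = [
    (bucket, key)
    for bucket, key in new_object_keys(body)
    if should_index(key, "cloudtrail/", r"\.json(\.gz)?$")
]
```

Here only the first record survives: the second fails the regex and the third is not an object-created event.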