
Overview


Last updated 24 days ago


Self-hosted Scanner is a powerful option for running Scanner’s log analysis and detection capabilities entirely within your own AWS environment.

This article provides an overview of how Self-hosted Scanner works, how it’s deployed, and how it’s maintained over time.

How It Works

Self-hosted Scanner brings the full power of Scanner’s compute infrastructure into your AWS account. It offers the same functionality as our Managed Scanner (SaaS) solution while running entirely within your own account, which makes it an ideal choice for teams with strict security or compliance requirements that prevent reliance on external SaaS platforms.

Like Managed Scanner, Self-hosted Scanner processes and analyzes log data at scale, delivering fast search and real-time detections.

You interact with it through the familiar Scanner user interface, a single-page JavaScript application hosted on our AWS CloudFront CDN. The UI connects to WebSocket and HTTP servers running in your AWS account, ensuring a smooth experience while keeping your data and compute resources in-house.

Self-hosted Scanner uses Auth0 to manage authentication sessions, so that user access to the UI is both secure and seamless.

Key benefits include:

  • All data flows within your AWS environment: With Self-hosted Scanner, log data stays entirely within your AWS organization—moving securely between your S3 buckets and the Scanner AWS account via a VPC Gateway endpoint. No log data traverses the public internet, keeping your workflows fully internal.

  • Scaling without breaking the budget: Scanner scales effortlessly to index and analyze tens of terabytes per day, leveraging your AWS discounts and commitments for cost-effective growth—no budget surprises, even at enterprise scale.

  • Compliance for sensitive data: Since all data remains within your AWS environment, Self-hosted Scanner simplifies compliance with stringent requirements, offering greater control compared to SaaS solutions.

Compared to traditional SIEMs, Self-hosted Scanner offers 5-10x lower costs for high log volumes, with a pricing model based on terabytes indexed per day (TB/day). Contact our sales team for details tailored to your needs.

How It Gets Deployed

Deploying Self-hosted Scanner is a process designed to get you up and running quickly with minimal effort on your part. Here’s how it works:

  1. Setup by Scanner: Our team creates a new AWS account and provisions the necessary compute infrastructure (such as EC2 instances, ECS clusters, RDS databases, and S3 connectivity) in the AWS region of your choice (e.g., us-east-1). We recommend selecting the same region as your S3 buckets: Scanner then reads them through a VPC gateway endpoint, and intra-region S3 access over that endpoint avoids data transfer costs.

  2. Account Transfer: Once configured, we transfer ownership of the AWS account from Scanner’s AWS organization to yours. This ensures you have full control over the environment.

  3. S3 Bucket Integration: After the transfer, you’ll link the S3 buckets containing the logs you want Scanner to index and analyze, then configure S3 import rules for them. Import rules cover basic settings like timestamp extraction (via fields or regex) and optional transformations (e.g., normalizing to the Elastic Common Schema).

  4. Onboarding: Provisioning a new Self-Hosted Scanner AWS account typically takes one business day. Following the transfer, we schedule a 1-hour concierge onboarding session to help you integrate your first log sources and ensure everything is running smoothly.

Note: The standard deployment for Self-hosted Scanner involves the Scanner team creating a new AWS account, bootstrapping the environment, and transferring ownership of the account to you. If you prefer not to transfer ownership, the Scanner team can initialize the environment directly in your AWS account. In either case, Scanner must be deployed into a new AWS account dedicated exclusively to running the Scanner compute infrastructure. Contact us to discuss your needs.

The infrastructure auto-scales to handle your log volume, and while region selection is key, other settings are managed automatically. No advanced technical expertise is required—just a few straightforward steps to connect your data.
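As a rough illustration of the two timestamp extraction approaches mentioned in step 3 (reading a field versus matching a regex), here is a minimal Python sketch. It is not Scanner’s implementation or its import rule syntax; the function names, the `eventTime` field in the usage note, and the ISO 8601 pattern are all assumptions chosen for the example.

```python
import json
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed pattern for this sketch: an ISO 8601 UTC timestamp, e.g. 2024-05-01T12:34:56Z.
ISO_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def _parse(value: str) -> Optional[datetime]:
    """Convert an ISO 8601 UTC string to an aware datetime, or None if invalid."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def timestamp_from_field(line: str, field: str) -> Optional[datetime]:
    """Field-based extraction: parse a JSON log line and read a named field."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None
    value = event.get(field) if isinstance(event, dict) else None
    return _parse(value) if isinstance(value, str) else None


def timestamp_from_regex(line: str) -> Optional[datetime]:
    """Regex-based extraction: find the first timestamp-shaped substring."""
    match = ISO_TIMESTAMP.search(line)
    return _parse(match.group(0)) if match else None
```

Field-based extraction suits structured JSON logs, for example `timestamp_from_field(line, "eventTime")`, while the regex variant covers unstructured lines such as syslog-style text where the timestamp is embedded in the message.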

How It Gets Maintained Over Time

Once deployed, Self-hosted Scanner is a hands-off experience for your team. Our engineering and operations teams handle maintenance, updates, and scaling, so you can focus on using the insights Scanner provides. Here’s what to expect:

  • Automatic Updates: New versions of Scanner’s backend are deployed automatically by our team using a deployer IAM role retained in your AWS account.

  • Ops and Debugging: For tasks like debugging, running database migrations, or scaling infrastructure, our ops team may access your AWS account with temporary AssumeRole sessions. You have full visibility into these actions via CloudTrail logs, and for specific data issues, you can invite a Scanner ops member to log in to your team’s account via the Scanner UI (and remove them afterward).

  • No Customer Maintenance: You’re not responsible for any ongoing tasks—Scanner handles everything from infrastructure scaling to meeting service-level agreements (SLAs) for performance and reliability.

  • Support When You Need It: If an issue arises (e.g., a bug or outage), our team is alerted automatically and works to resolve it. You can reach us anytime via a private Slack channel, email, or our 24/7 emergency phone number for urgent matters.

The deployer role is assigned an IAM policy with AdministratorAccess permissions within the Self-hosted Scanner AWS account. This lets the role manage infrastructure changes during deployments, and lets Scanner admins perform essential operations and maintenance tasks, such as scaling resources or debugging issues. The role’s administrator access is limited to the account in which the Scanner compute runs, which is why that AWS account must be dedicated solely to Scanner. Your team retains full visibility into all of this role’s actions via CloudTrail logs.
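To make the CloudTrail visibility concrete, the sketch below filters CloudTrail records down to the actions performed by sessions of a single role. It relies on the standard CloudTrail record fields (`userIdentity.type`, `userIdentity.sessionContext.sessionIssuer.arn`, `eventTime`, `eventSource`, `eventName`). The role ARN and the function name are hypothetical, since this article does not name the deployer role, and how you load the records (for example from the `Records` array of a CloudTrail log file) is left out.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical ARN for illustration; the real deployer role name is not given here.
DEPLOYER_ROLE_ARN = "arn:aws:iam::123456789012:role/example-scanner-deployer"


def actions_by_role(
    records: Iterable[Dict[str, Any]], role_arn: str
) -> List[Dict[str, str]]:
    """Summarize CloudTrail records issued by assumed-role sessions of role_arn."""
    summary = []
    for record in records:
        identity = record.get("userIdentity", {})
        # Only sessions created through AssumeRole carry this identity type.
        if identity.get("type") != "AssumedRole":
            continue
        issuer = identity.get("sessionContext", {}).get("sessionIssuer", {})
        if issuer.get("arn") != role_arn:
            continue
        summary.append(
            {
                "time": record.get("eventTime", ""),
                "service": record.get("eventSource", ""),
                "action": record.get("eventName", ""),
            }
        )
    return summary
```

Calling `actions_by_role(records, DEPLOYER_ROLE_ARN)` on a batch of parsed records returns one time, service, and action entry per event attributed to that role, which is a convenient starting point for a periodic review.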

Why Choose Self-hosted Scanner?

Self-hosted Scanner combines the ease of a managed service with the flexibility of a self-hosted solution. It’s perfect for organizations needing to:

  • Meet compliance requirements: all data storage and processing remain within your AWS environment.

  • Leverage AWS budgets and discounts for compute infrastructure.

  • Process high log volumes affordably with Search and Detections features.

Get started

While you’re responsible for the AWS compute costs (which you can estimate based on log volume; ask sales for more info), Scanner takes care of the rest. This balance gives you control over your budget and environment without the operational overhead.

To get started with deploying Self-hosted Scanner, contact our sales team. They will give you more information about the process, set up a Proof of Concept trial, learn about the problems you want to solve, and get you started.
