Scanner for Splunk

Run search queries on your high-volume logs in S3 directly from Splunk via custom commands.


What is Scanner for Splunk?

Scanner provides a Splunk app that allows teams to rapidly search their object storage logs directly from Splunk.

The app adds two custom search commands that are available system-wide in Splunk: scanner and scannertable.

What problem does it solve?

Indexing high-volume log sources in Splunk is often very expensive. Teams can cut these costs by redirecting the logs to S3 and indexing them with Scanner instead, sometimes for 80-90% less than indexing them in Splunk.

Querying these high-volume log sources from Splunk is still useful, and Scanner lets you do it at high speed, especially for needle-in-a-haystack queries. For example, searching a 100 TB log data set for a list of IP addresses, emails, or UUIDs takes only 10 seconds in Scanner. Running the same search against a 1 PB log data set takes about 100 seconds. This can be 10-100x faster than tools like Athena, especially against raw JSON logs that have not yet been optimized with Parquet and partitioning.
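The two timings quoted above are consistent with search time growing roughly in proportion to data volume. The following is a minimal back-of-the-envelope sketch of that arithmetic, not a description of how Scanner works internally. The linear-scaling model and the function names are assumptions made for illustration; only the 100 TB in 10 seconds figure comes from the text.

```python
# Reference point quoted in the text: a 100 TB data set searched in 10 seconds.
REFERENCE_TB = 100
REFERENCE_SECONDS = 10


def implied_throughput_tb_per_s() -> float:
    """Effective scan rate implied by the reference point (TB per second)."""
    return REFERENCE_TB / REFERENCE_SECONDS


def estimate_search_seconds(dataset_tb: float) -> float:
    """Estimate search time for a data set of the given size.

    ASSUMPTION: search time scales linearly with data volume. This is a
    simplification for illustration, not a documented guarantee.
    """
    return dataset_tb / implied_throughput_tb_per_s()


if __name__ == "__main__":
    # 1 PB is taken as 1000 TB (decimal units).
    for size_tb in (100, 1000):
        seconds = estimate_search_seconds(size_tb)
        print(f"{size_tb} TB: about {seconds:.0f} s")
```

Under this assumption the 1 PB estimate matches the figure of about 100 seconds given in the text. Actual times will depend on the query and the data.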

Dashboards

You can also execute Scanner queries to populate dashboards in Splunk. It is almost always best to use the scannertable command with dashboard queries since widgets tend to consume data in tabular format.

For example, this query counts CloudTrail log events for S3, grouped by eventName, excluding GetObject events. We can use the result to generate a bar chart in the dashboard.

| scannertable q="%ingest.source_type: 'aws:cloudtrail' 
  and eventSource: 's3.amazonaws.com' and not eventName: 'GetObject'
  | stats by eventName"

Access more data from Splunk, reduce blind spots

If you want Splunk to be your single pane of glass where you can analyze both Splunk logs and the logs you have in object storage, Scanner can help you make this happen.

Using Scanner, you can run fast queries against your object storage logs, join them against your Splunk logs, create dashboards from object storage logs, and more.

With your object storage logs easily queryable from Splunk, you can avoid blind spots and keep Splunk costs low.

[Screenshot: The scanner command returns results as log events]
[Screenshot: Create dashboard widgets using scannertable]