Scanner MCP Tools Reference
Scanner provides five MCP tools that enable AI agents and interactive clients to query security data, explore your environment, and execute threat hunting operations.
Overview
get_scanner_context: Load a condensed Scanner query reference, available indexes, and source types. Called first by AI agents.
get_docs: Retrieve detailed documentation for a specific topic. On-demand reference lookup.
get_top_columns: Discover the most frequent column names for one or more indexes. Field discovery before writing queries.
execute_query: Run ad-hoc queries against Scanner logs. Core query execution.
fetch_query_results: Retrieve specific fields from cached query results. Result refinement.
Design: Efficient Context Management
Scanner MCP tools are designed to work together efficiently, avoiding context bloat that would consume excessive tokens and reduce AI agent capability.
Querying the Right Data
The Challenge: Before an AI agent can write a useful query, it needs to know what data exists, what fields are available, and how the query language works. Loading all of this context upfront—full documentation, every index's schema, all field statistics—consumes thousands of tokens before a single query is run, leaving less room for actual investigation.
The Solution: A layered, on-demand discovery pattern:
get_scanner_context returns a condensed starting point—a query documentation table of contents, a compact list of available indexes, and source types. This gives the AI enough orientation to begin without loading full documentation or field metadata.
get_docs provides detailed documentation for a specific topic (syntax, aggregation, examples, or best_practices_and_mistakes) only when the AI actually needs it, for example to look up aggregation syntax before writing a stats query.
get_top_columns returns the most frequently occurring column names for one or more indexes. The AI calls this to discover available fields for the specific indexes it plans to query, rather than receiving column statistics for every index upfront.
This means the AI learns just enough to start, then deepens its knowledge on demand as the investigation requires.
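The layered pattern can be sketched as an ordered call plan. This is a minimal illustration, not Scanner's client code: the plan_discovery helper and the placeholder token string are hypothetical, and calling get_scanner_context with no arguments is an assumption. Only the tool names, the section values, and the context_token and indices parameters come from this page.

```python
# Hypothetical helper: lists the discovery calls an agent might make, in order.
DOC_SECTIONS = {"syntax", "aggregation", "examples", "best_practices_and_mistakes"}


def plan_discovery(indices, doc_sections=()):
    """Return (tool_name, arguments) pairs for the layered discovery pattern."""
    unknown = set(doc_sections) - DOC_SECTIONS
    if unknown:
        raise ValueError(f"unknown get_docs sections: {sorted(unknown)}")

    # Step 1: always start with the condensed context; it yields the context_token.
    plan = [("get_scanner_context", {})]
    # Step 2: pull detailed docs only for the topics actually needed.
    for section in doc_sections:
        plan.append(("get_docs", {"section": section}))
    # Step 3: discover fields only for the indexes about to be queried.
    if indices:
        plan.append((
            "get_top_columns",
            # Placeholder: the real token comes from the step 1 response.
            {"context_token": "<token from get_scanner_context>",
             "indices": list(indices)},
        ))
    return plan


steps = plan_discovery(["my-cloudtrail-index"], ["aggregation"])
for tool, args in steps:
    print(tool, args)
```

Nothing is fetched for indexes or topics the agent has not asked about, which is the point of the pattern.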
Efficient Result Retrieval
The Challenge: Security queries often return thousands of rows. Loading all raw results into the AI's context would:
Consume massive amounts of tokens
Reduce the AI's ability to reason about findings
Make investigations slower and more expensive
The Solution: A two-stage result retrieval pattern:
execute_query returns a summary of results, not raw data:
Field names and top values for each field
Row count and data patterns
Statistical summaries (counts, distributions, time ranges)
A result_handle for later access
fetch_query_results allows selective retrieval:
AI examines the summary from execute_query
AI decides which fields and rows are actually relevant
Uses fetch_query_results to pull only those specific fields/rows into context
Avoids loading irrelevant data
Example: Query returns 5,000 S3 access events. Rather than load all 5,000 rows with all fields (massive context), the AI:
Sees the summary showing event types, usernames, buckets involved
Identifies which 50 rows are suspicious
Fetches only those 50 rows with only relevant fields (eventTime, userName, bucketName)
Saves 90% of token usage while retaining key context
This design enables AI agents to handle large datasets without context limitations becoming a bottleneck.
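To make the two-stage idea concrete, here is a small local simulation. It does not call Scanner and is not how Scanner implements summaries: the rows are synthetic, and the summarize and fetch functions are hypothetical stand-ins for what execute_query and fetch_query_results return conceptually.

```python
from collections import Counter

# Synthetic rows standing in for cached query results (not real Scanner output).
rows = [
    {"eventTime": f"2024-01-01T00:{i % 60:02d}:00Z",
     "userName": "alice" if i % 100 else "mallory",
     "bucketName": "logs" if i % 2 else "backups",
     "userAgent": "aws-cli"}
    for i in range(5000)
]


def summarize(rows, top_n=3):
    """Stage 1: a compact summary (row count plus top values per field)."""
    fields = sorted({name for row in rows for name in row})
    return {
        "row_count": len(rows),
        "top_values": {
            name: Counter(row.get(name) for row in rows).most_common(top_n)
            for name in fields
        },
    }


def fetch(rows, fields, predicate, limit=50):
    """Stage 2: only the requested fields, only for matching rows."""
    matches = (row for row in rows if predicate(row))
    return [{name: row.get(name) for name in fields} for row in matches][:limit]


summary = summarize(rows)
suspicious = fetch(rows, ["eventTime", "userName", "bucketName"],
                   lambda row: row["userName"] == "mallory")
print(summary["row_count"], len(suspicious))
```

The summary is a handful of counters regardless of how many rows matched, and the second stage carries three fields for 50 rows instead of every field for 5,000.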
1. get_scanner_context
Purpose: Load a condensed Scanner query reference, available indexes, and source types.
Key Points:
Called first — AI agents are instructed to call this before executing any queries
Returns a context_token — Required input for execute_query and get_top_columns
Discovers your data — Shows what indexes and source types are available in your Scanner instance
Condensed reference — Returns a table of contents for query documentation, not the full docs (use get_docs for details)
What It Provides:
A condensed Scanner query language reference (table of contents)
Available indexes with names and descriptions
Discovered source types (e.g., aws:cloudtrail, kubernetes:audit, proxy, dns)
A context_token for use with other tools
When to Use:
Start of any investigation or query session
When exploring what data sources are available
Example Usage:
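A minimal sketch of the request an MCP client might send. The envelope follows the JSON-RPC 2.0 tools/call shape used by MCP. The tools_call helper is hypothetical, and calling this tool with an empty argument object is an assumption, since this page lists no parameters for it.

```python
import json


def tools_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP uses."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumption: get_scanner_context is called with an empty argument object.
request = tools_call(1, "get_scanner_context", {})
print(json.dumps(request, indent=2))
```

In practice an MCP client library or the AI agent itself builds this envelope. The context_token in the response is then passed to execute_query and get_top_columns.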
2. get_docs
Purpose: Retrieve detailed documentation for a specific Scanner query language topic.
Required Parameters:
section — One of: syntax, aggregation, examples, best_practices_and_mistakes
Available Sections:
syntax — Query syntax rules, operators, and grammar
aggregation — Aggregation functions and usage
examples — Example queries for common use cases
best_practices_and_mistakes — Best practices and common errors to avoid
Key Features:
On-demand loading — Fetch only the documentation you need, when you need it
No context_token required — Can be called at any time
Reduces token usage — Avoids loading thousands of tokens of documentation upfront
When to Use:
Before writing a query, to look up syntax or aggregation functions
When a query fails, to review best practices and common mistakes
To find example queries for a specific use case
Example Usage:
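A minimal sketch of the arguments for a get_docs call. The get_docs_arguments helper is hypothetical. The section parameter and its four allowed values are the ones listed above.

```python
VALID_SECTIONS = ("syntax", "aggregation", "examples", "best_practices_and_mistakes")


def get_docs_arguments(section):
    """Validate the section name and return the arguments for a get_docs call."""
    if section not in VALID_SECTIONS:
        raise ValueError(f"section must be one of {VALID_SECTIONS}, got {section!r}")
    # No context_token is needed for get_docs.
    return {"section": section}


print(get_docs_arguments("aggregation"))
```

Checking the section name locally avoids spending a tool call on a value the server would not recognize.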
3. get_top_columns
Purpose: Discover the most frequently occurring column names for one or more indexes.
Required Parameters:
context_token — Obtained from get_scanner_context
indices — List of index names to query (e.g., ["my-cloudtrail-index", "_detections"])
Key Features:
Multi-index support — Query columns for multiple indexes in a single call
Frequency-sorted — Returns columns sorted by how often they appear in your data
Compact format — Results use a tuple format (["column_name", count]) to minimize token usage
Time-scoped — Reflects column usage from the last 7 days
When to Use:
Before writing a query, to discover what fields are available in an index
When exploring an unfamiliar data source
To understand the schema of your log data
Example Usage:
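A minimal sketch of the call arguments and of reading the tuple-format result. The response below is hypothetical: the mapping from index name to a list of pairs is an assumed shape, and the column names and counts are made up for illustration. The top_columns helper is also hypothetical.

```python
# Arguments for the call; the token value is a placeholder.
arguments = {
    "context_token": "<token from get_scanner_context>",
    "indices": ["my-cloudtrail-index", "_detections"],
}

# Hypothetical response shape and values, for illustration only.
response = {
    "my-cloudtrail-index": [
        ["eventName", 9120],
        ["userIdentity.arn", 9087],
        ["sourceIPAddress", 8876],
    ],
    "_detections": [["name", 42], ["severity", 42]],
}


def top_columns(response, index, n=2):
    """Return the n most frequent column names for one index."""
    pairs = sorted(response.get(index, []), key=lambda pair: pair[1], reverse=True)
    return [name for name, _count in pairs[:n]]


print(top_columns(response, "my-cloudtrail-index"))
```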
4. execute_query
Purpose: Execute an ad-hoc query against Scanner to search your security logs.
Required Parameters:
context_token — Obtained from get_scanner_context (proves you've loaded syntax rules)
query — Your Scanner Query Language query
start_time — Query start time (ISO 8601 format, inclusive)
end_time — Query end time (ISO 8601 format, exclusive)
Optional Parameters:
max_rows — Maximum rows to return (default: 1000, max: 10000)
max_bytes — Memory limit in bytes (default: 128MB)
Key Features:
Blocking execution — Waits for results (supports configurable timeouts)
Result caching — Results are cached for subsequent operations
Field-level summaries — Returns summaries to reduce token usage
Time-bounded queries — Scope queries to specific time windows
What It Returns:
Query execution status
Result summary with key findings
A result_handle for fetching detailed results with fetch_query_results
Row count and execution time
Sample of returned fields
When to Use:
Execute threat hunting queries
Search for specific events or patterns
Investigate alerts or incidents
Test detection rule logic
Explore data patterns
Example Usage:
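A minimal sketch of building the arguments for a 24-hour window. The execute_query_arguments helper is hypothetical, the token is a placeholder, and the query text is an illustrative placeholder rather than verified Scanner syntax (get_docs with section syntax is the reference for that). Clamping max_rows locally is a design choice of this sketch, not documented server behavior.

```python
from datetime import datetime, timedelta, timezone

MAX_ROWS_LIMIT = 10_000  # documented maximum for max_rows


def execute_query_arguments(context_token, query, end, lookback, max_rows=1000):
    """Build execute_query arguments for the window [end - lookback, end)."""
    if end.tzinfo is None:
        raise ValueError("end must be timezone-aware")
    start = end - lookback
    return {
        "context_token": context_token,
        "query": query,
        "start_time": start.isoformat(),  # inclusive
        "end_time": end.isoformat(),      # exclusive
        # Clamp into the documented range instead of sending an invalid value.
        "max_rows": max(1, min(max_rows, MAX_ROWS_LIMIT)),
    }


args = execute_query_arguments(
    context_token="<token from get_scanner_context>",
    # Placeholder query text; check get_docs("syntax") for the real grammar.
    query='eventSource: "s3.amazonaws.com"',
    end=datetime(2024, 1, 2, tzinfo=timezone.utc),
    lookback=timedelta(hours=24),
    max_rows=50_000,
)
print(args["start_time"], args["end_time"], args["max_rows"])
```

Because start_time is inclusive and end_time is exclusive, consecutive windows built this way do not overlap or leave gaps.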
5. fetch_query_results
Purpose: Retrieve specific fields and rows from previously executed query results.
Required Parameters:
result_handle — Cache handle returned from execute_query
fields — List of field names to retrieve (required)
Optional Parameters:
limit — Maximum matching rows to return (default: 50, max: 1000)
offset — Rows to skip before filtering (default: 0)
row_filter_regex — Regex pattern to filter which rows are returned
Key Features:
Selective field retrieval — Fetch only the fields you need
Pattern matching — Filter results with regex to find specific events
Pagination support — Use offset and limit for large result sets
Efficient browsing — Avoids re-executing queries
When to Use:
Get detailed results after an initial query
Extract specific fields from large result sets
Filter results to find matching events
Drill down into query results progressively
Refine investigation based on initial findings
Example Usage:
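A minimal sketch of building the arguments for a filtered fetch and for unfiltered paging. The fetch_arguments and paged helpers are hypothetical, and the handle string and regex are placeholders. The parameter names, defaults, and the limit maximum are the ones listed above.

```python
MAX_LIMIT = 1000  # documented maximum for limit


def fetch_arguments(result_handle, fields, limit=50, offset=0, row_filter_regex=None):
    """Build the arguments for one fetch_query_results call."""
    if not fields:
        raise ValueError("fields is required")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    args = {"result_handle": result_handle, "fields": list(fields),
            "limit": limit, "offset": offset}
    if row_filter_regex is not None:
        args["row_filter_regex"] = row_filter_regex
    return args


def paged(result_handle, fields, total_rows, limit=50):
    """Unfiltered paging: step offset by limit until total_rows is covered."""
    return [fetch_arguments(result_handle, fields, limit, offset)
            for offset in range(0, total_rows, limit)]


handle = "<result_handle from execute_query>"
filtered = fetch_arguments(handle, ["eventTime", "userName", "bucketName"],
                           row_filter_regex="mallory")
pages = paged(handle, ["eventTime"], total_rows=120, limit=50)
print([page["offset"] for page in pages])
```

Since offset counts rows skipped before filtering while limit counts matching rows, stepping offset by limit is only a clean paging scheme when no row_filter_regex is set. That reading follows from the parameter descriptions above and has not been checked against the server.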
Version Changelog
Scanner MCP is versioned. Here's what changed between releases.
v0.0.2 (current)
get_scanner_context returns condensed output — Instead of returning full inline documentation and field statistics, it now returns a compact query reference table of contents, a slim list of available indexes, and source types. This significantly reduces token usage on the initial call.
New tool: get_docs — Retrieve detailed documentation for a specific topic (syntax, aggregation, examples, best_practices_and_mistakes). Replaces the inline documentation previously returned by get_scanner_context.
New tool: get_top_columns — Discover the most frequently occurring columns for one or more indexes. Replaces the per-index field statistics previously returned by get_scanner_context.
v0.0.1
Initial release with three tools: get_scanner_context, execute_query, fetch_query_results.
get_scanner_context returns full inline documentation and field statistics for all indexes.
Related Documentation
Getting Started — Setup instructions for Scanner MCP in different tools
Detection Engineering — Use Scanner MCP to build and validate detection rules
Autonomous Workflows — Build agents using Scanner MCP tools
Interactive Investigations — Run investigations with Scanner MCP