When to use it

Once high-volume log sources start to increase costs dramatically, we recommend moving these logs to a data lake in S3 - and indexing them with Scanner for fast search.

Problem - modern log scale can become unsustainable

Scanner was designed to solve the problem of modern log scale. In our opinion, traditional log management tools and SIEMs become far too expensive once logs reach high volume.

If you are ingesting 100GB of logs per day into a traditional SIEM, you might be spending on the order of $100k per year. This is somewhat pricey, but not too terrible.

However, as your company grows, it's very easy to reach the point where you are generating 1TB of logs per day. At the same rough rate (roughly $1,000 per year for each GB/day of ingestion), this can cost $1M per year in traditional SIEM tools. This is extremely painful.

At this scale, teams often split their logs into two categories: low volume log sources and high volume log sources.

In many environments, roughly half of the total ingestion volume comes from only 3-5 high volume log sources, such as web application firewall logs, VPC flow logs, CloudTrail logs, and Cloudflare DNS and HTTP logs.

These high volume logs tend to be less critical, but they are still incredibly helpful for investigations and detecting threats.

Solution - Move high volume logs to a data lake, and index the data lake with Scanner

Here's what we propose. Teams can continue to ingest their low volume log sources into their traditional SIEM, but they should move their high volume logs to a data lake in S3.

They can then use Scanner to index their data lake for fast search from Scanner's UI.
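As a rough illustration of the data lake side, here is a minimal sketch (in Python with boto3) that writes a batch of high volume log events to S3 as gzipped JSON lines. The bucket name, key prefix, and event fields are hypothetical placeholders; in practice, the object layout should match the S3 import rules you configure in Scanner.

```python
# Minimal sketch: ship a batch of high volume log events to an S3 data lake
# as a gzipped JSON-lines object. Bucket name and key prefix are hypothetical;
# align them with the S3 import rules you configure in Scanner.
import datetime
import gzip
import json

import boto3

s3 = boto3.client("s3")

def ship_batch(events, bucket="my-log-data-lake"):
    """Write a batch of log events as one gzipped JSON-lines object in S3."""
    now = datetime.datetime.utcnow()
    key = f"vpc-flow/{now:%Y/%m/%d}/{now:%H%M%S}.json.gz"  # hypothetical prefix
    body = gzip.compress(
        "\n".join(json.dumps(event) for event in events).encode("utf-8")
    )
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    return key

# Example usage:
# ship_batch([{"src_addr": "10.0.0.1", "dst_addr": "10.0.0.2", "action": "ACCEPT"}])
```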

Cost improvement

Before Scanner

Here is how the costs change. Before Scanner, the bill for ingesting 1TB of logs per day into the SIEM is around $1M/year: low volume logs account for $500k of the cost, and high volume logs account for the other $500k.

|  | # of log sources | Ingest volume | Ingest cost |
| --- | --- | --- | --- |
| Low volume log sources in Traditional SIEM | 25-100 log sources | 500GB/day | $500k/year |
| High volume log sources in Traditional SIEM | 3-5 log sources | 500GB/day | $500k/year |
| Total |  | 1TB/day | $1M/year |

After Scanner

After moving high volume logs to an S3 data lake and indexing them with Scanner, the cost of high volume logs drops by 80% to $100k per year, reducing the overall cost from $1M per year down to $600k per year.

|  | # of log sources | Ingest volume | Ingest cost |
| --- | --- | --- | --- |
| Low volume log sources in Traditional SIEM | 25-100 log sources | 500GB/day | $500k/year |
| High volume log sources in data lake indexed by Scanner | 3-5 log sources | 500GB/day | $100k/year |
| Total |  | 1TB/day | $600k/year |

By moving high volume logs to a data lake and indexing them with Scanner, overall costs are reduced by 40%, which can free up meaningful budget for other projects.
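To make the arithmetic explicit, here is a small sketch that reproduces the figures above; the dollar amounts are the illustrative estimates from the tables, not output from a pricing calculator.

```python
# Reproduce the cost comparison above using the illustrative figures from the tables.
before = {"low_volume_siem": 500_000, "high_volume_siem": 500_000}
after = {"low_volume_siem": 500_000, "high_volume_scanner": 100_000}

total_before = sum(before.values())   # $1,000,000/year
total_after = sum(after.values())     # $600,000/year

high_volume_savings = 1 - after["high_volume_scanner"] / before["high_volume_siem"]
overall_savings = 1 - total_after / total_before

print(f"High volume cost reduction: {high_volume_savings:.0%}")  # 80%
print(f"Overall cost reduction: {overall_savings:.0%}")          # 40%
```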

What are the tradeoffs?

Moving high volume log sources out of a traditional SIEM and into an S3 data lake indexed by Scanner can produce significant cost savings while keeping search in Scanner fast, but there are some practical tradeoffs to consider.

  • Queries are limited to what Scanner's query language supports, which may differ from the query language of your prior log tool. For more information on the kinds of queries Scanner supports, see:

    • Query Syntax

    • Aggregation Functions
