Part 1: Search and Analysis

In this part of the guide, we'll use Scanner Search to analyze our log data to uncover threat behavior. We'll walk through everything from basic querying to generating aggregation summaries.

1. Run a saved query

First, let's run a saved query to look for AWS API calls that failed with an error code.

Some failures are normal, but other kinds of failures, like "access denied" failures, might indicate malicious activity.

Open up the Query side bar on the left, open up Library, and select Playground - AWS CloudTrail Error Events. Click Query to load the saved query into the query text box.

The query looks for all log events in the playground index whose source type is aws:cloudtrail and where errorCode is present.

Above the query text box, change the time range from Last 15 minutes to Last 1 day.

Click Run.

2. View column statistics

You can view statistics about the most common values of the columns in your search results.

It's important to note that these column value statistics are computed on only the 1,000 log events that have been loaded into the search results table.
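To make this sampling caveat concrete, here is a minimal Python sketch (not Scanner's implementation) that counts the most common values over only the loaded rows. The function name `column_value_stats` and the sample events are hypothetical; only the 1,000-row limit comes from the text above.

```python
from collections import Counter

LOADED_ROW_LIMIT = 1000  # rows loaded into the search results table, per the text above


def column_value_stats(matching_rows, column, top_n=5):
    """Count the most common values of `column` among the loaded rows only."""
    loaded = matching_rows[:LOADED_ROW_LIMIT]
    counts = Counter(row[column] for row in loaded if column in row)
    return counts.most_common(top_n)


# Hypothetical data: 1,500 events match the query, but only the first 1,000 are counted.
rows = (
    [{"errorCode": "AccessDenied"}] * 600
    + [{"errorCode": "NoSuchBucket"}] * 400
    + [{"errorCode": "ThrottlingException"}] * 500
)
stats = column_value_stats(rows, "errorCode")
print(stats)
```

In this made-up data set, the third error code never appears in the statistics because all of its events fall outside the loaded rows. To count values across every matching event, use an aggregation rather than the column statistics.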

Open up the Columns sidebar on the right.

Click on the errorCode column name. You'll see that AccessDenied is one of the most common values.

That seems suspicious, so we will dig into these AccessDenied API calls further.

Click the + button next to AccessDenied to add errorCode="AccessDenied" to the query.

Click Run.

3. View log event details

The search results should now show AWS API calls that failed with an AccessDenied error.

In the search results, click on one of the rows to view the log event details.

Look for the column called errorMessage, and take a look at the message it contains.

It looks like a particular user is trying to modify IAM policies. This is a bit suspicious - it could be an attempt at privilege escalation. Let's dig in further.

4. Run free-text search

In the errorMessage column, you should see text like this:

User: arn:aws:iam::798029671665:user/pgibbons is not authorized to perform: iam:PutRolePolicy on resource: arn:aws:iam::798029671665:role/data_maint_6fb4f0

Let's focus our investigation on this user.

Use your mouse to drag-select the ARN (Amazon Resource Name) of the user in the text.

arn:aws:iam::798029671665:user/pgibbons

After selecting the text, you will see a dialog box come up with a few options.

Click on the option that says Query selected tokens. This will replace the current query text.

This query will look for any logs that contain the IAM user ARN in any of their columns.

Click Run.

Scanner can search for your query text across all columns in the data. You don't need to be intimately familiar with the schema of your data to run a search query.

Note that non-wildcard text search will only match full tokens in the text. For more information, see the documentation on Token Boundaries.
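The idea of full-token matching can be sketched in Python as follows. This is an illustration only: the tokenization rule used here (tokens are runs of letters, digits, and underscores) is an assumption made for the example, and the helper names are hypothetical. The Token Boundaries documentation is the authority on how Scanner actually splits text.

```python
import re


def tokenize(text):
    # Assumption for illustration: a token is a run of letters, digits, or underscores.
    return re.findall(r"[A-Za-z0-9_]+", text)


def contains_token_sequence(haystack, needle):
    """True when the tokens of `needle` appear consecutively, as whole tokens, in `haystack`."""
    h, n = tokenize(haystack), tokenize(needle)
    if not n:
        return False
    return any(h[i:i + len(n)] == n for i in range(len(h) - len(n) + 1))


message = (
    "User: arn:aws:iam::798029671665:user/pgibbons is not authorized "
    "to perform: iam:PutRolePolicy"
)

# The full ARN is made of whole tokens, so it matches.
full_match = contains_token_sequence(message, "arn:aws:iam::798029671665:user/pgibbons")
# "pgibb" is only part of the token "pgibbons", so it does not match.
partial_match = contains_token_sequence(message, "pgibb")
print(full_match, partial_match)
```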

5. Summarize data with a simple count aggregation

There are a lot of log events that contain this IAM user ARN. Instead of inspecting all of these logs individually, let's summarize them, computing a simple count aggregated by some columns. Specifically, we'll aggregate the events by the error code we see, the AWS service, the API operation, and the user ARN.

In the query box, add this text on a new line at the end of the query:

| groupbycount errorCode, eventSource, eventName, userIdentity.arn

This tells Scanner to "pipe" the search results into a groupbycount function, aggregated by those columns.

The final query should look something like this:

"arn:aws:iam::798029671665:user/pgibbons"
| groupbycount errorCode, eventSource, eventName, userIdentity.arn

Every Scanner query consists of a filter step at the beginning, and then zero or more processing steps. Each step is separated by a "pipe" character: |

For example, in the query above, the filter step is this:

"arn:aws:iam::798029671665:user/pgibbons"

And it is followed by one processing step, which is this:

| groupbycount errorCode, eventSource, eventName, userIdentity.arn

Processing steps can be used to do many things, like rename columns in logs and generate summaries and aggregations.

For more information, see the documentation on Query Syntax and Aggregation Functions.
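As a rough mental model, the filter step and the groupbycount step behave like the two Python functions below. This is a sketch under assumptions, not how Scanner executes queries: the events are hypothetical, nested fields are flattened into dotted keys, and the filter uses plain substring matching for brevity.

```python
from collections import Counter

USER_ARN = "arn:aws:iam::798029671665:user/pgibbons"
GROUP_COLUMNS = ["errorCode", "eventSource", "eventName", "userIdentity.arn"]


def filter_step(rows, text):
    """Keep rows where any column value contains the text (simplified matching)."""
    return [row for row in rows if any(text in str(v) for v in row.values())]


def groupbycount(rows, columns):
    """Count rows for each distinct combination of the given columns."""
    counts = Counter(tuple(row.get(c) for c in columns) for row in rows)
    return [{**dict(zip(columns, key)), "@q.count": n} for key, n in counts.most_common()]


# Hypothetical events for illustration.
iam_denied = {
    "errorCode": "AccessDenied",
    "eventSource": "iam.amazonaws.com",
    "eventName": "PutRolePolicy",
    "userIdentity.arn": USER_ARN,
}
s3_read = {
    "eventSource": "s3.amazonaws.com",
    "eventName": "GetObject",
    "userIdentity.arn": USER_ARN,
}
other_user = {
    "eventSource": "s3.amazonaws.com",
    "eventName": "GetObject",
    "userIdentity.arn": "arn:aws:iam::798029671665:user/someone_else",
}
events = [iam_denied] * 2 + [s3_read] * 3 + [other_user]

# Filter first, then pipe the surviving rows into the aggregation.
summary = groupbycount(filter_step(events, USER_ARN), GROUP_COLUMNS)
for row in summary:
    print(row)
```

Rows without an errorCode (successful calls) end up in their own group, which is why successes and failures are easy to tell apart in the summary.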

Click Run.

In the results, we see that the user is attempting to run various IAM commands, which all fail with AccessDenied.

However, it looks like the user is executing a lot of GetObject requests against the s3.amazonaws.com service, and they are all succeeding.

This could be a case of data exfiltration from an AWS S3 bucket.

6. Pivot from summary data to detailed logs

Let's now pivot from the summary table in front of us to look at detailed logs for this user.

Delete the groupbycount aggregation part of the query. That is, delete everything from the | character onward.

Hover your mouse over the eventSource column in the row that says s3.amazonaws.com. Click to copy that value. Paste the value into the query text box after the IAM user ARN.

Do the same for eventName in the row that says GetObject. Paste the value into the query text box.

The query should now look something like this:

"arn:aws:iam::798029671665:user/pgibbons"
s3.amazonaws.com
GetObject

The filter query above is equivalent to this one (notice the and between each term):

"arn:aws:iam::798029671665:user/pgibbons" and
s3.amazonaws.com and
GetObject

In general, these two filter queries are the same:

term1 term2 term3

term1 and term2 and term3
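The equivalence can be pictured with a small Python sketch. It is a simplification for illustration only: a term here matches when it equals some column value exactly, the event is hypothetical, and both function names are made up for this example.

```python
def matches_implicit(row, terms):
    """Terms listed side by side: every term must match."""
    values = {str(v) for v in row.values()}
    return all(term in values for term in terms)


def matches_explicit(row, term1, term2, term3):
    """The same condition written out with explicit `and`."""
    values = {str(v) for v in row.values()}
    return term1 in values and term2 in values and term3 in values


ARN = "arn:aws:iam::798029671665:user/pgibbons"
terms = [ARN, "s3.amazonaws.com", "GetObject"]

# Hypothetical events, with nested fields flattened into dotted keys.
get_event = {"userIdentity.arn": ARN, "eventSource": "s3.amazonaws.com", "eventName": "GetObject"}
put_event = {"userIdentity.arn": ARN, "eventSource": "s3.amazonaws.com", "eventName": "PutObject"}

for event in (get_event, put_event):
    print(matches_implicit(event, terms), matches_explicit(event, *terms))
```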

Click Run.

When the query is finished, click on one of the rows to view the full log event details of one of these S3 GetObject requests.

Let's try to see what data is being downloaded by this user.

Since there are many columns in this log event, let's use the Filter box to search for specific columns. (The Filter box is in the log event details panel that opens up when you click on a row.)

Type requestParameters into the Filter box to look at just the columns related to the request parameters.

We can now see a few columns, such as requestParameters.bucketName and requestParameters.key. From the names of the S3 bucket and key, this data appears to be sensitive. This could be a data exfiltration attack.

7. Summarize data with rename, stats, and sum

Let's summarize all of the S3 data exfiltration activity from this user. In particular, let's find out which buckets are being read from, how much data is being exfiltrated, and so on.

To make the results a bit easier to deal with, we'll use rename to change the names of some of the long column names to be shorter. Specifically, we'll add this to the end of the query:

| rename 
  additionalEventData.bytesTransferredOut as bytes_exfiltrated,
  requestParameters.bucketName as s3_bucket,
  userIdentity.userName as user_name

Then we'll use stats and sum to add up the total number of bytes exfiltrated, aggregated by S3 bucket and user name. We will add this to the end of the query:

| stats sum(bytes_exfiltrated) as total_bytes_exfiltrated 
  by s3_bucket, user_name

The full query should now look something like this:

"arn:aws:iam::798029671665:user/pgibbons"
s3.amazonaws.com
GetObject
| rename 
  additionalEventData.bytesTransferredOut as bytes_exfiltrated,
  requestParameters.bucketName as s3_bucket,
  userIdentity.userName as user_name
| stats sum(bytes_exfiltrated) as total_bytes_exfiltrated 
  by s3_bucket, user_name

Click Run.

When we look at the results, we can see that the user is downloading data from a few S3 buckets.
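For readers who think in code, the rename and stats steps above correspond roughly to the Python sketch below. It is only an illustration of the semantics, not Scanner's implementation: the byte counts and bucket assignments are invented, and nested fields are flattened into dotted keys.

```python
from collections import defaultdict


def rename(rows, mapping):
    """Rename columns according to `mapping`, leaving other columns untouched."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def stats_sum(rows, value_column, out_column, by):
    """Sum `value_column` for each distinct combination of the `by` columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row.get(c) for c in by)
        totals[key] += row.get(value_column, 0)
    return [{**dict(zip(by, key)), out_column: total} for key, total in totals.items()]


def make_event(bucket, num_bytes):
    # Hypothetical S3 GetObject event with made-up byte counts.
    return {
        "additionalEventData.bytesTransferredOut": num_bytes,
        "requestParameters.bucketName": bucket,
        "userIdentity.userName": "pgibbons",
    }


events = [
    make_event("initech-prod1-customer-financial-txns", 1000),
    make_event("initech-prod1-customer-financial-txns", 2500),
    make_event("initech-prod1-payment-processing-logs", 400),
]

renamed = rename(events, {
    "additionalEventData.bytesTransferredOut": "bytes_exfiltrated",
    "requestParameters.bucketName": "s3_bucket",
    "userIdentity.userName": "user_name",
})
totals = stats_sum(renamed, "bytes_exfiltrated", "total_bytes_exfiltrated", ["s3_bucket", "user_name"])
for row in totals:
    print(row)
```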

8. Pivot to view if other users are involved

Let's pivot the results a bit to see the top users downloading data from these buckets. To do this, we need to edit the query text a bit.

Specifically, let's edit the filter step of the query, which is everything in the query before the first | pipe character, i.e. this:

"arn:aws:iam::798029671665:user/pgibbons"
s3.amazonaws.com
GetObject

First, delete the IAM user ARN from the query to "zoom out" to view all users interacting with S3.

Next, add a filter term to match logs where the requestParameters.bucketName column is any of the S3 buckets from the summary results. The filter step of the query should now look like this:

s3.amazonaws.com
GetObject
requestParameters.bucketName=(
  initech-prod1-customer-financial-txns
  initech-prod1-payment-processing-logs
)

Note that this query:

requestParameters.bucketName=(
  initech-prod1-customer-financial-txns
  initech-prod1-payment-processing-logs
)

is equivalent to this query (with or between each column value in the list):

requestParameters.bucketName=(
  initech-prod1-customer-financial-txns or
  initech-prod1-payment-processing-logs
)

It's also equivalent to this query:

requestParameters.bucketName=initech-prod1-customer-financial-txns or
requestParameters.bucketName=initech-prod1-payment-processing-logs

In general, these three filter queries are equivalent:

column=(value1 value2 value3)

column=(value1 or value2 or value3)

column=value1 or column=value2 or column=value3
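In programming terms, the value list behaves like a set-membership test. The Python sketch below illustrates the equivalence; the function names and the third bucket name are hypothetical, and this is not a description of how Scanner evaluates the filter.

```python
def column_in_list(row, column, values):
    """column=(value1 value2 value3): the column equals any value in the list."""
    return row.get(column) in set(values)


def column_or_chain(row, column, values):
    """column=value1 or column=value2 or column=value3, written out as a chain."""
    return any(row.get(column) == value for value in values)


BUCKETS = [
    "initech-prod1-customer-financial-txns",
    "initech-prod1-payment-processing-logs",
]

# Hypothetical events; the third bucket name is made up for illustration.
rows = [
    {"requestParameters.bucketName": "initech-prod1-customer-financial-txns"},
    {"requestParameters.bucketName": "initech-prod1-payment-processing-logs"},
    {"requestParameters.bucketName": "some-unrelated-bucket"},
]

list_results = [column_in_list(r, "requestParameters.bucketName", BUCKETS) for r in rows]
chain_results = [column_or_chain(r, "requestParameters.bucketName", BUCKETS) for r in rows]
print(list_results, chain_results)
```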

Keep the rename and stats steps of the query.

Here is what the final query should look like:

s3.amazonaws.com
GetObject
requestParameters.bucketName=(
  initech-prod1-customer-financial-txns
  initech-prod1-payment-processing-logs
)
| rename 
  additionalEventData.bytesTransferredOut as bytes_exfiltrated,
  requestParameters.bucketName as s3_bucket,
  userIdentity.userName as user_name
| stats sum(bytes_exfiltrated) as total_bytes_exfiltrated 
  by s3_bucket, user_name

Click Run.

9. Pivot to view the full shape of the attack

In the results, we can see that there are multiple IAM users exfiltrating data from these S3 buckets.

Let's try to understand if these user identities are associated with one another.

First, let's get a unique list of user names by adding a groupbycount user_name to the end of the query.

The full query should now look like this:

s3.amazonaws.com
GetObject
requestParameters.bucketName=(
  initech-prod1-customer-financial-txns
  initech-prod1-payment-processing-logs
)
| rename 
  additionalEventData.bytesTransferredOut as bytes_exfiltrated,
  requestParameters.bucketName as s3_bucket,
  userIdentity.userName as user_name
| stats sum(bytes_exfiltrated) as total_bytes_exfiltrated 
  by s3_bucket, user_name
| groupbycount user_name

We can see a list of three users who are involved in the threat activity.

Let's zoom out and view all of the AWS activity of these three users.

First, delete the entire query.

Then, add a filter section to the query that looks at all activity for the three users from the results table:

userIdentity.userName=(pgibbons mbolton samirn)

Note: You don't always need to use =. You can also use : in a query like column: value. The : operator will find hits where column contains the text tokens from value. So, for example, you could also run this query to check for all IAM user ARNs that contain the three user names.

userIdentity.arn:(pgibbons mbolton samirn)
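The difference between = and : can be sketched in Python like this. The tokenization rule (runs of letters, digits, and underscores) is an assumption made for the example, the event is hypothetical, and the helper names are invented; treat it as a mental model rather than Scanner's actual matching logic.

```python
import re


def tokens(text):
    # Assumption for illustration: a token is a run of letters, digits, or underscores.
    return re.findall(r"[A-Za-z0-9_]+", text)


def equals_any(row, column, values):
    """column=(v1 v2 ...): the whole column value equals one of the values."""
    return row.get(column) in values


def contains_any(row, column, values):
    """column:(v1 v2 ...): the column contains all the tokens of at least one value."""
    column_tokens = set(tokens(row.get(column, "")))
    return any(set(tokens(v)) <= column_tokens for v in values)


USERS = ["pgibbons", "mbolton", "samirn"]

# Hypothetical event, with nested fields flattened into dotted keys.
event = {
    "userIdentity.userName": "pgibbons",
    "userIdentity.arn": "arn:aws:iam::798029671665:user/pgibbons",
}

name_equals = equals_any(event, "userIdentity.userName", USERS)
arn_equals = equals_any(event, "userIdentity.arn", USERS)
arn_contains = contains_any(event, "userIdentity.arn", USERS)
print(name_equals, arn_equals, arn_contains)
```

The ARN never equals a bare user name, so = finds nothing on userIdentity.arn, while : matches because the ARN contains the user name as a token.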

Next, let's summarize all of their AWS activity across all services, and group by a few things:

Did the operation fail or succeed: errorCode
The AWS service: eventSource
The API operation that was executed: eventName

This can be accomplished by adding stats to the end of the query. The output of stats always includes at least a count aggregation field called @q.count:

| stats by errorCode, eventSource, eventName

The full query should look like this:

userIdentity.userName=(pgibbons mbolton samirn)
| stats by errorCode, eventSource, eventName

Click Run.

Using groupbycount will give you the same results as using stats without any arguments before the by keyword.

In other words, this:

... | groupbycount field1, field2

will give you the same result as this:

... | stats by field1, field2

Specifically, they will both give you a table with these columns:

field1
field2
@q.count

You can sort the result table by errorCode by clicking on the errorCode column header. This will partition the results into two groups: the operations that succeeded and the operations that failed.

It looks like these suspicious users have successfully executed commands against the lambda.amazonaws.com service and the events.amazonaws.com service. Specifically, it appears that Lambda functions are being updated and event rules are being created.

This is also worrisome. Modifying serverless functions and creating event rules, which are frequently used to schedule serverless function invocations, may be an instance of command and control.

Let's work to block the attack by building detection rules.
