Part 1: Search and Analysis
In this part of the guide, we'll use Scanner Search to analyze our log data to uncover threat behavior. We'll walk through everything from basic querying to generating aggregation summaries.
1. Run a saved query
First, let's run a saved query to look for AWS API calls that failed with an error code.
Some failures are normal, but other kinds of failures, like "access denied" failures, might indicate malicious activity.
Open the Query sidebar on the left, open Library, and select Playground - AWS CloudTrail Error Events. Click Query to load the saved query into the query text box.
The query looks for all log events in the playground index whose source type is aws:cloudtrail and where errorCode is present.
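As a rough sketch (the saved query text isn't reproduced here), the filter has roughly the shape below. The %ingest.source_type column name and the trailing * presence check are assumptions about the exact syntax, and the playground index itself may be chosen with the index selector rather than in the query text:

```
%ingest.source_type: "aws:cloudtrail" errorCode: *
```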
Above the query text box, change the time range from Last 15 minutes to Last 1 day.
Click Run.
2. View column statistics
You can view statistics about the most common values of the columns in your search results.
It's important to note that these column value statistics are computed on only the 1,000 log events that have been loaded into the search results table.
Open up the Column sidebar on the right.
Click on the errorCode column name. You'll see that AccessDenied is one of the most common values.
That seems suspicious, so we will dig into these AccessDenied API calls further.
Click the + button next to AccessDenied to add errorCode="AccessDenied" to the query.
Click Run.
3. View log event details
The search results should now show AWS API calls that failed with an AccessDenied error.
In the search results, click on one of the rows to view the log event details.
Look for the column called errorMessage, and take a look at the message it contains.
It looks like a particular user is trying to modify IAM policies. This is a bit suspicious - it could be an attempt at privilege escalation. Let's dig in further.
4. Run free-text search
In the errorMessage column, you should see text like this:
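(The exact message in the playground data isn't reproduced here. An AccessDenied errorMessage generally has the shape below; the account ID, user name, and IAM action are made-up placeholders.)

```
User: arn:aws:iam::123456789012:user/example-user is not authorized to perform: iam:PutUserPolicy on resource: arn:aws:iam::123456789012:user/target-user
```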
Let's focus our investigation on this user.
Use your mouse to drag-select the ARN (Amazon Resource Name) of the user in the text.
After selecting the text, you will see a dialog box come up with a few options.
Click on the option that says Query selected tokens. This will replace the current query text.
This query will look for any logs that contain the IAM user ARN in any of their columns.
Click Run.
Scanner can search for your query text across all columns in the data. You don't need to be intimately familiar with the schema of your data to run a search query.
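For example, if the selected ARN were the placeholder arn:aws:iam::123456789012:user/example-user used in this guide's sketches, the query text would become just that quoted value:

```
"arn:aws:iam::123456789012:user/example-user"
```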
5. Summarize data with a simple count aggregation
There are a lot of log events that contain this IAM user ARN. Instead of inspecting all of these logs individually, let's summarize them, computing a simple count aggregated by some columns. Specifically, we'll aggregate the events by the error code we see, the AWS service, the API operation, and the user ARN.
In the query box, add this text on a new line at the end of the query:
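A sketch of that text, assuming the standard CloudTrail column userIdentity.arn holds the user ARN (the exact column list and separators may differ slightly):

```
| groupbycount errorCode eventSource eventName userIdentity.arn
```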
This tells Scanner to "pipe" the search results into a groupbycount function, aggregated by those columns.
The final query should look something like this:
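Continuing the sketch, with the placeholder ARN standing in for the real one:

```
"arn:aws:iam::123456789012:user/example-user"
| groupbycount errorCode eventSource eventName userIdentity.arn
```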
Every Scanner query consists of a filter step at the beginning, and then zero or more processing steps. Each step is separated by a "pipe" character: |
For example, in the query above, the filter step is this:
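(In the sketch above, that filter step is just the quoted ARN.)

```
"arn:aws:iam::123456789012:user/example-user"
```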
And it is followed by one processing step, which is this:
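(And the processing step is the aggregation.)

```
groupbycount errorCode eventSource eventName userIdentity.arn
```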
Processing steps can be used to do many things, like rename columns in logs and generate summaries and aggregations.
For more information, see the documentation on Query Syntax and Aggregations.
Click Run.
In the results, we see that the user is attempting to run various IAM commands, which all fail with AccessDenied.
However, it looks like the user is executing a lot of GetObject requests against the s3.amazonaws.com service, and they are all succeeding.
This could be a case of data exfiltration from an AWS S3 bucket.
6. Pivot from summary data to detailed logs
Let's now pivot from the summary table in front of us to look at detailed logs for this user.
Delete the groupbycount aggregation part of the query. That is, delete everything from the | character onward.
Hover your mouse over the eventSource column in the row that says s3.amazonaws.com. Click to copy that value. Paste the value into the query text box after the IAM user ARN.
Do the same for eventName in the row that says GetObject. Paste the value into the query text box.
The query should now look something like this:
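A sketch with the placeholder ARN, where the copied values are pasted as quoted text terms:

```
"arn:aws:iam::123456789012:user/example-user" "s3.amazonaws.com" "GetObject"
```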
The filter query above is equivalent to the following (notice the and between each term):
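(Same placeholder terms, with the and spelled out.)

```
"arn:aws:iam::123456789012:user/example-user" and "s3.amazonaws.com" and "GetObject"
```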
In general, these two filter queries are the same:
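(Adjacent filter terms are implicitly combined with and; term1 and term2 here are generic stand-ins for any filter terms.)

```
term1 term2
term1 and term2
```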
Click Run.
When the query is finished, click on one of the rows to view the full log event details of one of these S3 GetObject requests.
Let's try to see what data is being downloaded by this user.
Since there are many columns in this log event, let's use the Filter box to search for specific columns. (The Filter box is in the log event details panel that opens up when you click on a row.)
Type requestParameters into the Filter box to look at just the columns related to the request parameters.
We can see a few columns now: requestParameters.bucketName and requestParameters.key. From the names of the S3 bucket and key, this data appears to be sensitive. This could be a data exfiltration attack.
7. Summarize data with rename, stats, and sum
Let's summarize all of the S3 data exfiltration activity from this user. In particular, let's find out which buckets are being read from, how much data is being exfiltrated, and so on.
To make the results a bit easier to deal with, we'll use rename to shorten some of the long column names. Specifically, we'll add this to the end of the query:
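A sketch of the rename step. The original column names here are standard CloudTrail fields, but the exact columns used in the playground data, the shortened names, and the precise rename syntax (old as new, comma-separated) are assumptions:

```
| rename userIdentity.userName as user_name, requestParameters.bucketName as bucket_name, sourceIPAddress as src_ip
```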
Then we'll use stats and sum to add up the total number of bytes exfiltrated, aggregated by S3 bucket, IP address, and user name. We will add this to the end of the query:
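A sketch of the stats step, assuming the CloudTrail field additionalEventData.bytesTransferredOut holds the number of bytes returned by each GetObject call (this field name and the exact sum syntax are assumptions):

```
| stats sum(additionalEventData.bytesTransferredOut) by bucket_name src_ip user_name
```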
The full query should now look something like this:
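Putting the sketches together:

```
"arn:aws:iam::123456789012:user/example-user" "s3.amazonaws.com" "GetObject"
| rename userIdentity.userName as user_name, requestParameters.bucketName as bucket_name, sourceIPAddress as src_ip
| stats sum(additionalEventData.bytesTransferredOut) by bucket_name src_ip user_name
```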
Click Run.
When we look at the results, we can see that the user is downloading data from a few S3 buckets.
8. Pivot to view if other users are involved
Let's pivot the results a bit to see the top users downloading data from these buckets. To do this, we need to edit the query text a bit.
Specifically, let's edit the filter step of the query, which is everything in the query before the first | pipe character, i.e. this:
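(In the running sketch, that filter step is:)

```
"arn:aws:iam::123456789012:user/example-user" "s3.amazonaws.com" "GetObject"
```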
First, delete the IAM user ARN from the query to "zoom out" to view all users interacting with S3.
Next, add a filter term to match logs where the requestParameters.bucketName column is any of the S3 buckets from the summary results. The filter step of the query should now look like this:
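A sketch with made-up bucket names; the exact syntax for matching a column against a list of values is covered in the Query Syntax docs:

```
"s3.amazonaws.com" "GetObject" requestParameters.bucketName: ("customer-data-bucket" or "financial-reports-bucket")
```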
Note that a filter term which matches a column against a list of values is equivalent to repeating the column once per value, with or between each term. In general, filter queries of the following forms are equivalent:
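(A sketch of the idea; the exact list syntax may differ.)

```
requestParameters.bucketName: ("bucket-a" or "bucket-b")
requestParameters.bucketName: "bucket-a" or requestParameters.bucketName: "bucket-b"
```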
Keep the rename and stats steps of the query.
Here is what the final query should look like:
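Again as a sketch with placeholder names:

```
"s3.amazonaws.com" "GetObject" requestParameters.bucketName: ("customer-data-bucket" or "financial-reports-bucket")
| rename userIdentity.userName as user_name, requestParameters.bucketName as bucket_name, sourceIPAddress as src_ip
| stats sum(additionalEventData.bytesTransferredOut) by bucket_name src_ip user_name
```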
Click Run.
9. Pivot to view the full shape of the attack
In the results, we can see that there are multiple IAM users exfiltrating data from these S3 buckets.
Let's try to understand if these user identities are associated with one another.
First, let's get a unique list of user names by adding a groupbycount user_name step to the end of the query.
The full query should now look like this:
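That is, the previous sketch with one more step appended:

```
"s3.amazonaws.com" "GetObject" requestParameters.bucketName: ("customer-data-bucket" or "financial-reports-bucket")
| rename userIdentity.userName as user_name, requestParameters.bucketName as bucket_name, sourceIPAddress as src_ip
| stats sum(additionalEventData.bytesTransferredOut) by bucket_name src_ip user_name
| groupbycount user_name
```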
We can see a list of three users who are involved in the threat activity.
Let's zoom out and view all of the AWS activity of these three users.
First, delete the entire query.
Then, add a filter section to the query that looks at all activity for the three users from the results table:
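For example, with made-up user names standing in for the three you found, and assuming the CloudTrail column userIdentity.userName holds the short user name:

```
userIdentity.userName="user-name-1" or userIdentity.userName="user-name-2" or userIdentity.userName="user-name-3"
```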
Note: You don't always need to use =. You can also use : in a query like column: value. The : operator will find hits where column contains the text tokens from value. So, for example, you could also run this query to check for all IAM user ARNs that contain the three user names:
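(Same placeholder user names.)

```
userIdentity.arn: "user-name-1" or userIdentity.arn: "user-name-2" or userIdentity.arn: "user-name-3"
```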
Next, let's summarize all of their AWS activity across all services, and group by a few things: whether the operation failed or succeeded (errorCode), the AWS service (eventSource), and the API operation that was executed (eventName).
This can be accomplished by adding stats to the end of the query. The output of stats always includes at least a count aggregation field called @q.count:
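(A sketch; with no aggregation arguments before by, stats emits just the @q.count count per group.)

```
| stats by errorCode eventSource eventName
```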
The full query should look like this:
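(Continuing with the placeholder user names.)

```
userIdentity.userName="user-name-1" or userIdentity.userName="user-name-2" or userIdentity.userName="user-name-3"
| stats by errorCode eventSource eventName
```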
Click Run.
Using groupbycount will give you the same results as using stats without any arguments before the by keyword.
In other words, this:
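(Sketched with the generic field names field1 and field2.)

```
groupbycount field1 field2
```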
will give you the same result as this:
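(Same fields, via stats.)

```
stats by field1 field2
```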
Specifically, they will both give you a table with these columns: field1, field2, and @q.count.
You can sort the result table by errorCode by clicking on the errorCode column header. This will partition the results into two groups: the operations that succeeded and the operations that failed.
It looks like these suspicious users have successfully executed commands against the lambda.amazonaws.com service and the events.amazonaws.com service. Specifically, it appears that Lambda functions are being updated and event rules are being created.
This is worrisome yet again. Modifying serverless functions and creating event rules, which are frequently used to schedule serverless function invocations, may be an instance of command and control.
Let's work to block the attack by building detection rules.