Have you ever wondered how many failed
ConsoleLogin attempts you had last month and which regions or IP addresses they originated from? Do you want to know who has been creating all those AWS Accounts in your Organization? Or who created a specific EC2 instance that is completely untagged? (See Auto Tag and Retro Tag).
The good news is that there is a magical data source with answers to all these questions and more – CloudTrail. The bad news is that it is huge and tricky to search without the right guidance and tools.
A comprehensive guide to CloudTrail Event Names
In this blog post and a series of companion posts, we will explore how to search your CloudTrail data quickly and efficiently, and what types of insights you can glean from this data source.
Athena is a fully managed, query service that doesn’t require you to configure any servers. It allows you to search your unstructured data in S3 using SQL and pay per query. We specify our CloudTrail S3 bucket and, as you will see below, our different partition keys and we can start to search our CloudTrail data efficiently and inexpensively.
Running queries over S3 data and paying per query sounds great. However, if the data set is large and queries require the loading of most/all data, your queries can take a fail with timeouts. As a result, your Athena bill can start to inflate.
Looking at the contents of your CloudTrail buckets, you will see that the paths of each log file contain the following:
If we’re trying to query our CloudTrail data for data from the last 90 days for a subset of accounts or just a particular region, it doesn’t make sense to load all CloudTrail data or all years, accounts and regions.
That’s why partitioning your database is important and why using partition keys in your queries will help you keep costs low and queries fast.
Typically, to start adding partitions to your table, you would need to run queries like this across all the permutations of partition keys in your environment:
ALTER TABLE cloudtrail_logs ADD PARTITION (account='<account id>', region='<region>', year='<year>', month='<month>', day='<day>') LOCATION 's3://<cloudtrail s3 bucket name>/AWSLogs/<org id>/<account id>/CloudTrail/<region>/<year>/<month>/<day>/';
As you can imagine, the number of accounts, regions, years, months and days within an organization’s CloudTrail bucket can make this process cumbersome and time consuming. As time continues, you will likely want to add even more partitions for other accounts, regions, years, months and days.
This sounds like a perfect fit for an AWS Glue Crawler. Glue Crawlers explore the data available in S3, modify the schema based on the shape of the data discovered and add partitions based on partition keys discovered. However, after trialing this approach, we found that it was much slower (and therefore more expensive) than we expected. Furthermore, a bug in the Glue Crawlers modified the schema, breaking some of the column name constraints that are required by Hive.
What we needed was essentially a quick and easy mechanism for periodic partition creation.
Because of all of these challenges, we created the open-source Athena CloudTrail Partitioner to help you:
account/region/year/month/daykeys available in the CloudTrail S3 bucket
Follow the instructions in the
README.md to get started.
We’re just going to touch on a few questions in this post. We’ll explore a range of other interesting questions in a series of later posts.
Which IAM identities are the top successful users of IAM:AssumeRole in the last 30 days?
SELECT useridentity.sessioncontext.sessionissuer.arn, count(useridentity.sessioncontext.sessionissuer.arn) as count FROM cloudtrail_logs WHERE year = '<YEAR>' AND account = '<ACCOUNT>' AND eventname = 'AssumeRole' AND errorcode is null AND from_iso8601_timestamp(eventtime) > date_add('day', -30, now()) GROUP BY useridentity.sessioncontext.sessionissuer.arn;
Who is attempting to assume the role and how many times has it failed? This could reveal broken automation, broken privileges or something much worse.
How many AssumeRole attempts have failed by User Identity in the last fortnight?
SELECT useridentity.sessioncontext.sessionissuer.arn, count(useridentity.sessioncontext.sessionissuer.arn) as count FROM cloudtrail_logs WHERE year = '<YEAR>' AND account = '<ACCOUNT>' -- want this to run quicker? Just specify the month(s) too (as this is part of the partition) AND eventname = 'AssumeRole' AND errorcode = 'AccessDenied' AND from_iso8601_timestamp(eventtime) > date_add('day', -14, now()) GROUP BY useridentity.sessioncontext.sessionissuer.arn;
It is best practice not to use the
Root user on your AWS Account. It is a good idea to periodically query as to whether or not anyone has used this user in the last 90 days. If there are results, go have a conversation with your teams to get an understanding of why it happened and whether or not it can be prevented in the future.
SELECT DISTINCT COUNT(*) FROM cloudtrail_logs WHERE year = '<YEAR>' AND account = '<ACCOUNT>' AND useridentity.type = 'Root' AND useridentity.invokedby != 'support.amazonaws.com' AND from_iso8601_timestamp(eventtime) > date_add('day', -90, now());
There are a number of interesting insights that you can garner from your CloudTrail logs using Athena. More importantly, how you can use it to protect your cloud such as monitoring IAM activities.
If you want to learn more, check out the on demand webinar: Actionable Security Insights using CloudTrail and Athena, or reach out to us on our public slack channel.