March 23, 2021

Query WAF logs with Athena

I’ve been using AWS WAF a bit recently and I needed a way to query the logs that are shipped to S3.

Athena is the logical solution to this. There is great documentation to get you started with querying WAF logs via Athena and also how to setup WAF logging.

My specific requirement required me to start off with the WAF in COUNT mode. While WAF has an excellent facility to to see samples, they only last for 3 hours so I needed the ability to get the logs.

Looking at the documentation it seems it would be logical to use a where clause of action='COUNT', this is not that case, as the action is the final action and COUNT is a what is referred to as a non terminating rule.

Get the results I needed we have to look at the schema and use some of the presto functions to query and filter on the actual array data.

I’ve come up with this query as a result:

SELECT to_iso8601(from_unixtime(timestamp / 1000)) AS time_ISO_8601,
         rules.rulegroupid,
         rules.terminatingrule.ruleid AS ruleid,
         httprequest.uri,
         httprequest.httpmethod,
         httprequest.args,
         httprequest.headers
FROM prod_waf_logs, UNNEST(rulegrouplist) t(rules)
WHERE cardinality(nonterminatingmatchingrules) !=0
        AND from_unixtime(timestamp/1000) > now() - interval '1' day
ORDER BY  time_ISO_8601 ASC;

Let’s unravel this line by line.

First off we want the time to be human readable, as UNIX epoch isn’t really something I can read and understand.

Then we want to get some rule information. To get this we need to UNNEST the rulegrouplist and create a new pseudo set called rules. (Have a look at the FROM line. This is now in a state we can get to the struct data and use the . notation. We get the rulegroupid and ruleid (this is the rule group and rule that triggered the COUNT).

httprequest is a simple struct so we can grab the data out of that quite simply. A note here; I’m not totally happy with the result of headers and while I tried to turn it into a string I eventually decided this result was close enough for my use case.

Now the important WHERE clause.

cardinality(nonterminatingmatchingrules) !=0 is looking for rules that were matched, as if they were that array nonterminatingmatchingrules would have a rules in it, then we limit the result to the past day, manipulate as you will.

Lastly we order by time with the oldest at the top.

Result: WAF Results

Not too shabby.

© Greg Cockburn

Powered by Hugo & Kiss.