SQL Reference


Querying with SQL


Resurface is based on Presto, a powerful open-source SQL query engine.

For official documentation on Presto, these links are the most helpful:

The rest of this guide shows specific examples of SQL statements for Resurface data.

Message Table


The 'describe' command returns information about all available columns.


describe resurface."public/http".message

Retrieve all columns, just a few columns, or just a count of all rows.


select * from resurface."public/http".message limit 500
select request_url, response_code from resurface."public/http".message limit 500
select count(*) from resurface."public/http".message

Where Conditions


Search for messages that meet specific matching conditions.


where request_url like 'https://resurface.io%'
where request_url like '%/signup.post'
where response_body like '%REDtalks%'
where request_url like '%/signup.post' and response_body like '%REDtalks%'

JSON Functions


Retrieve or search on elements within JSON documents.


select json_extract_scalar(request_headers_json, '$["user-agent"]')
from resurface."public/http".message
where json_extract_scalar(request_headers_json, '$["user-agent"]') not like 'ELB%'
and strpos(json_extract_scalar(request_headers_json, '$["referer"]'), '//resurface.io') = 0

Date Functions


Convert between times and strings for searching & reporting.


select DATE_FORMAT(FROM_UNIXTIME(now / 1000),'%Y-%m-%d') as day, count(*) as total
from resurface."public/http".message
group by DATE_FORMAT(FROM_UNIXTIME(now / 1000),'%Y-%m-%d')
order by DATE_FORMAT(FROM_UNIXTIME(now / 1000),'%Y-%m-%d')

Pivoting on URL


Reporting which URLs are used most frequently.


select request_url, count(*) as total
from pulsar."public/surf".http_message
group by request_url
order by total desc

Pivoting on Request Headers


Reporting which user-agents are used most frequently. (excluding ELB traffic in this example)


select json_extract_scalar(request_headers_json, '$["user-agent"]') as user_agent, count(*) as total
from resurface."public/http".message
where json_extract_scalar(request_headers_json, '$["user-agent"]') not like 'ELB%'
group by json_extract_scalar(request_headers_json, '$["user-agent"]')
order by total desc

Reporting which referers are most active. (excluding internal referers in this example)


select json_extract(request_headers_json, '$["referer"]') as referer, count(*) as total
from resurface."public/http".message
where strpos(json_extract_scalar(request_headers_json, '$["referer"]'), 'resurface.io') = 0
group by json_extract(request_headers_json, '$["referer"]')
order by total desc

Pivoting on Request Parameters


Let's assume a POST form with three fields: email, platform and language.


select DATE_FORMAT(FROM_UNIXTIME(now / 1000),'%Y-%m-%d') as submitted,
json_extract(request_params_json, '$["email"]') as email,
json_extract(request_params_json, '$["platform"]') as platform,
json_extract(request_params_json, '$["language"]') as language
from resurface."public/http".message
where request_url like '%/signup.post' and request_method = 'POST'

Now summarize which platforms are used most frequently.


select json_extract(request_params_json, '$["platform"]') as platform, count(*) as total
from resurface."public/http".message
where request_url like '%/signup.post' and request_method = 'POST'
group by json_extract(request_params_json, '$["platform"]')
order by total desc