sql | Monday, 11 October 2021
MySQL query to extract domains from URLs:
SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(target_url, '/', 3), '://', -1), '/', 1), '?', 1) AS domain
Explanation (cause non-trivial SQL rarely makes sense):
SUBSTRING_INDEX(target_url, '/', 3) - strips any path if the url has a protocol SUBSTRING_INDEX(THAT, '://', -1) - strips any protocol from THAT SUBSTRING_INDEX(THAT, '/', 1) - strips any path from THAT ( if there was no protocol ) SUBSTRING_INDEX(THAT, '?', 1) - strips the query string from THAT ( if there was no path or trailing / )