Skip to content
Advertisement

Need Help Extracting Values from URL Link in SQLite

I am using DB Browser for SQLite.

I have a table called ‘df’ and a column with a bunch of URL links called ‘links’. All the links are from the same website and follow the same structured format:

https://’website name’/’language’/’type of content’/’content id’/’name of content’

Few examples of language: 1. en_au 2. fr_ca 3. en_us etc..

Also there is only one unique value in the ‘type of content’ portion of the link.

Goal: I need help extracting all the content ids from the link.

I have tried regex, but this pull all the information that matches the regex instead of the content id. I have also used the substr function (see code below), but this raises a few issues: 1. Some of the links don’t begin with ‘www.’ 2. Not all the content ids are the same character length. 3. Not all of the language ids are the same character length.

SELECT
    substr(links, 36, instr(links, '/') +1) AS content_id
FROM df

Advertisement

Answer

Figured it out. This is the answer for Sqlite:

SELECT
	substr(links, instr(links, 'type of content') + 11, 6) AS content_id
FROM df

if you have one type of content, like in my case i only have blog_posts, then you can use 10. The number 10 allows you to skip ahead of “blog_posts/” in the url link to the content_id. Since all my content ids are 6 characters long, I put in 6 to capture only 6 values.

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement