Skip to content
Advertisement

Extracting hashtags by sql

I have this problem – currently I am extracting hashtags in a C# application but now I want to move it to SQL Server.

In C# I have code using a regex:

var regex = new Regex(@"#w+");
var matches = regex.Matches(item.Contents).ToList();

In T-SQL, I would like to have something like this:

--I have table #Keywords like this:
CREATE TABLE #Keywords 
(
    Word nvarchar(400),
    Id int
)

INSERT INTO #Keywords VALUES ('This is #text1 with #hashtag1', 1);
INSERT INTO #Keywords VALUES ('This is #text2 with #hashtag2', 2);
INSERT INTO #Keywords VALUES ('This is #text3 with #hashtag3', 3);

SELECT * FROM #Keywords

-- In result I want table like this:
CREATE TABLE #HashtagsResult 
(
    Word nvarchar(400),
    Id int
)

INSERT INTO #HashtagsResult VALUES ('#text1', 1);
INSERT INTO #HashtagsResult VALUES ('#hashtag1', 1);
INSERT INTO #HashtagsResult VALUES ('#text2', 2);
INSERT INTO #HashtagsResult VALUES ('#hashtag2', 2);
INSERT INTO #HashtagsResult VALUES ('#text3', 3);
INSERT INTO #HashtagsResult VALUES ('#hashtag3', 3);

SELECT * FROM #HashtagsResult

Advertisement

Answer

Using the string_split

SELECT id, s.value 
FROM #Keywords
cross apply string_split(Word, ' ') s
where s.value like '#%%'

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement