I am learning more raw SQL after coming from years of Rails and other ORMs, so have quite a ways to learn how to do complex queries efficiently. What I’m wondering here is _how to find all users which are missing some fields and associations, and returning which fields/associations they are missing.
I have a rough idea of how to write this in SQL but not exact (for PostgreSQL).
I have something like this data model, a table of users, a table of “social media links”, and an association mapping the link to the user, so a user can have many social media links, but there could be more than one user associated with one link (i.e. organizations):
CREATE TABLE users (
id INT GENERATED BY DEFAULT AS IDENTITY,
slug VARCHAR(255) NOT NULL,
name VARCHAR(255),
description TEXT,
PRIMARY KEY (id)
)
CREATE TABLE sociallinks (
id INT GENERATED BY DEFAULT AS IDENTITY,
type VARCHAR(255),
value TEXT,
PRIMARY KEY (id)
)
CREATE TABLE usersociallinks (
id INT GENERATED BY DEFAULT AS IDENTITY,
user_id INTEGER REFERENCES users;
sociallink_id INTEGER REFERENCES sociallinks,
PRIMARY KEY (id)
)
The question is, how do you perform the query (with this pseudocode):
select from users
join on usersociallinks.user_id = users.id
join on sociallinks.id = usersociallinks.id
where name = null
or description = null
or missing linkedin? (sociallinks.type == linkedin)
or missing facebook? (sociallinks.type == facebook)
return slug from users table
return has_name = false if name is null from users table
return has_description = false if description is null from users table
return has_linkedin = false if linkedin is null from sociallinks table
return has_facebook = false if facebook is null from sociallinks table
In natural language, “select the users which either don’t have a name or a description, or are missing a linkedin or facebook link (sociallinks.value), and return what fields they are missing”.
I can do this the naive, long and convoluted way or querying one thing at a time, but I’m wondering how to do this efficiently, in possibly just one query (or as few queries as possible).
SELECT * FROM users
WHERE name IS NULL
OR description IS NULL
LIMIT 1
Fetch those, then do:
const record = await knex.raw(SQL)
const output = {}
if (!record.name) output.hasName = false
if (!record.description) output.hasDescription = false
return output
Then the next query is more complex, but I would do one at a time.
How do you do it somewhat efficiently and in as few queries as possible? Fetching with a limit of say 100 users per query.
Input data would be:
users:
id,slug,name,description
1,foo,Foo,Im a description
2,bar,,Im a description too
3,baz,,
4,hello,Hello,
5,world,,
6,food,Foo,Im a descriptiond
7,bard,,Im a description tood
8,bazd,asdf,fdsa
9,hellod,,
10,worldd,,A worldd description
sociallinks:
id,type,value
1,facebook,foo
2,facebook,bar
3,facebook,baz
4,facebook,hello
5,facebook,world
6,linkedin,foo
7,linkedin,bar
8,linkedin,baz
9,linkedin,hello
10,linkedin,world
usersociallinks:
id,user_id,sociallink_id
1,1,1
2,2,2
3,2,6
4,3,7
5,5,3
6,8,4
7,8,8
8,9,9
Output data would be:
user_id,slug,has_name,has_description,has_linkedin,has_facebook
1,foo,true,true,false,true
2,bar,false,true,true,true
3,baz,false,false,true,false
4,hello,true,false,false,false
5,world,false,false,false,true
6,food,true,true,false,false
7,bard,false,true,false,false
// 8 has everything so it is missing
9,hellod,false,false,true,false
10,worldd,false,true,false,false
Advertisement
Answer
EXISTS()
to the rescue:
SELECT u.id, u.slug
, (u.name > '' ) AS has_name
, (u.description> '' ) AS has_description
, EXISTS(SELECT 1 FROM usersociallinks sl JOIN sociallinks s ON s.id = usl.sociallink_i
WHERE sl.user_id = u.id AND s.type = 'facebook') AS has_facebook
, EXISTS(SELECT 1 FROM usersociallinks sl JOIN sociallinks s ON s.id = usl.sociallink_i
WHERE sl.user_id = u.id AND s.type = 'linkedin') AS has_linkedin
FROM users u
ORDER BY u.id
;