Skip to content
Advertisement

Retrieve rows where Column1 != Multiple Values from Column2

I have 2 Location columns. I need to find a way to show only exceptions, where Location1 != Location2, BUT there are multiple values in Location2 column that can be associated with value in Lcoation1.

For example: San Diego in Location1 column can have 4 possible values in Location2 column (North, Central, South, San Diego) and such rows need to be excluded from the run. If San Diego in Location1 column and Seattle (for example) in Location2 column, that is considered as Exception.

create table Locations
(Id INT, Location1 VARCHAR(30), Location2 VARCHAR(30));

INSERT INTO Locations
VALUES
(1, 'San Diego', 'North'),
(2, 'San Diego', 'South'),
(3, 'San Diego', 'Central'),
(4, 'San Diego', 'Arizona'),
(5, 'San Diego', 'San Diego'),
(6, 'San Diego', 'Seattle'),
(7, 'San Diego', 'North'),
(8, 'San Diego', 'San Diego'),
(9, 'San Diego', 'Central'),
(10, 'San Diego', 'South')

SELECT * FROM Locations

I can exclude only obvious condition where Location 1 = Location 2:

SELECT *
FROM Locations
WHERE Location1 != Location2

Expected Output:

Id Lcoation1 Location2
4 San Diego Arizona
6 San Diego Seattle

Advertisement

Answer

If you are willing to do some manual (ongoing) maintenance, you could create a mapping table of valid Location1Location2 relations and filter Locations based on the mapping table. This way you define what is valid and you can easily find things that aren’t.

DROP TABLE IF EXISTS locations;
CREATE TABLE Locations
(Id INT, Location1 VARCHAR(30), Location2 VARCHAR(30));

DROP TABLE IF EXISTS RegionMap;
CREATE TABLE RegionMap (Location1 VARCHAR(30), Location2 VARCHAR(30));

INSERT dbo.RegionMap (Location1, Location2)
VALUES ('San Diego', 'North'),
('San Diego', 'South'),
('San Diego', 'Central'),
('San Diego', 'San Diego');

INSERT INTO Locations
VALUES
(1, 'San Diego', 'North'),
(2, 'San Diego', 'South'),
(3, 'San Diego', 'Central'),
(4, 'San Diego', 'Arizona'),
(5, 'San Diego', 'San Diego'),
(6, 'San Diego', 'Seattle'),
(7, 'San Diego', 'North'),
(8, 'San Diego', 'San Diego'),
(9, 'San Diego', 'Central');

SELECT l.Location1, l.Location2
FROM locations l
LEFT JOIN dbo.RegionMap m ON m.Location1 = l.Location1
                         AND m.Location2 = l.Location2
WHERE m.Location2 IS NULL;

Note here that I removed ID 10 (the 2nd instance of San Diego South) and you still get the proper results.

The drawback here is that you will have to put a process in place to maintain the mapping table. If you run this query routinely and get a regular report of the output it’ll help you to know what you might need to add to the mapping table.

But really, the best thing is to limit what the users can enter in as masterdata in the first place…

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement