I’m trying to join two tables in MySQL, in one I have a set of IDs (of the type GTEX-14BMU-1526-SM-5TDE6) and a set of type’s of tissue(SMTS), I have to select the IDs for the tissue type ‘Blood’ (which is another column of the same table), and then I have to take only the first two strings (GTEX-14BMU) from the ID name and make a list of the different ones.
Then I have to compare this to a second table in which I have a list of IDs that already are in the type (GTEX-14BMU) which have to meet the condition that the column sex of this same table is 2.
The expected result is a list with the IDs which are sex type 2 and have tissue type ‘Blood’, meaning the ones that are coinciding. I’m trying to solve this by using JOIN and all the needed conditions in the same statement, which is:
mysql> SELECT DISTINCT SUBSTRING_INDEX(g.SAMPID,'-',2) AS sampid, m.SUBJID, g.SMTS, m.SEX -> FROM GTEX_Sample AS g -> JOIN GTEX_Pheno AS m ON sampid=m.SUBJID -> WHERE m.SEX=2 -> AND g.SMTS='Blood';
But I’m either getting too many results from the combination of all possibilities or I’m getting an empty set. Is there any other way to do this?
Advertisement
Answer
Here:
JOIN GTEX_Pheno AS m ON sampid=m.SUBJID
I suspect that your intent is to refer to the substring_index()
expression that is defined in the select
clause (which is aliased sampid
as well). In SQL, you can’t reuse an alias defined in the select
clause in the same scope (with a few exceptions, such as the ORDER BY
clause, or the GROUP BY
clause in MySQL). So the database thinks you are referring to column sampid
of the sample table. If you had given a different alias (say sampid_short
) and tried to use in the ON
clause of the join, you would have met a compilation error.
You need to either repeat the expression, or use a subquery:
select substring_index(g.sampid, '-', 2) as sampid, m.subjid, g.smts, m.sex from gtex_sample as g inner join gtex_pheno as m on substring_index(g.sampid, '-', 2) = m.subjid where m.sex = 2 and g.smts = 'blood';