Skip to content
Advertisement

Combine table data in MySQL using JOIN

I’m trying to join two tables in MySQL, in one I have a set of IDs (of the type GTEX-14BMU-1526-SM-5TDE6) and a set of type’s of tissue(SMTS), I have to select the IDs for the tissue type ‘Blood’ (which is another column of the same table), and then I have to take only the first two strings (GTEX-14BMU) from the ID name and make a list of the different ones.

Then I have to compare this to a second table in which I have a list of IDs that already are in the type (GTEX-14BMU) which have to meet the condition that the column sex of this same table is 2.

The expected result is a list with the IDs which are sex type 2 and have tissue type ‘Blood’, meaning the ones that are coinciding. I’m trying to solve this by using JOIN and all the needed conditions in the same statement, which is:

mysql> SELECT DISTINCT SUBSTRING_INDEX(g.SAMPID,'-',2) AS sampid, m.SUBJID, g.SMTS, m.SEX 
 -> FROM GTEX_Sample AS g 
 -> JOIN GTEX_Pheno AS m ON sampid=m.SUBJID 
 -> WHERE m.SEX=2 
 -> AND g.SMTS='Blood';

But I’m either getting too many results from the combination of all possibilities or I’m getting an empty set. Is there any other way to do this?

Advertisement

Answer

Here:

JOIN GTEX_Pheno AS m ON sampid=m.SUBJID

I suspect that your intent is to refer to the substring_index() expression that is defined in the select clause (which is aliased sampid as well). In SQL, you can’t reuse an alias defined in the select clause in the same scope (with a few exceptions, such as the ORDER BY clause, or the GROUP BY clause in MySQL). So the database thinks you are referring to column sampid of the sample table. If you had given a different alias (say sampid_short) and tried to use in the ON clause of the join, you would have met a compilation error.

You need to either repeat the expression, or use a subquery:

select substring_index(g.sampid, '-', 2) as sampid, m.subjid, g.smts, m.sex 
from gtex_sample as g 
inner join gtex_pheno as m on substring_index(g.sampid, '-', 2) = m.subjid 
where m.sex = 2 and g.smts = 'blood';
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement