Skip to content
Advertisement

Join pandas dataframes based on different conditions

I´m using Pandas in Python and I’d like to join 2 dataframes. My first dataframe is:

id var date
1 ABCD 2019-01-01
1 ABCD 2017-06-01
1 ABCD 2016-06-01
2 ABCD 2016-01-01

The dataframe I want to join is:

var date
ABCD 2019-01-01
ABCD 2017-06-01
FJKL 2016-06-01
ABCD 2016-01-01

I want to join the firt dataframe by var and date but the condition is that the date of the second dataframe has to be smaller than the first. If I were in sql I would do:

   select 
   a.*,
   b.* 
   from fisrt_data a 
   join second_data b on a.var = b.var and a.date > b.date

Any ideas to implement with pandas?

Advertisement

Answer

There is merge and query:

(df1.merge(df2, on=['var'], suffixes=['_a','_b'])
    .query('date_a > date_b')
)

Output:

   id    var       date_a      date_b
1    1  ABCD   2019-01-01  2017-06-01
2    1  ABCD   2019-01-01  2016-01-01
5    1  ABCD   2017-06-01  2016-01-01
8    1  ABCD   2016-06-01  2016-01-01

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement