Fill NA and update columns from another dataframe

I want to conditionally fill missing values and update existing values using another dataframe.

Specifically, I want to fill the missing entries and update the existing entries in the values column of the dataframe smalldf.

The condition is that the value in column B (largedf) falls within the range given by the columns Range_FROM and Range_TO (smalldf). When several largedf records match, always use the minimum of their values to fill or update the values in smalldf.

  • For example, the second record of smalldf has a value of 3; because largedf holds a smaller value (1.3) in that range, it is updated to 1.3.
  • As another example, the third record of smalldf (NaN) is filled with 1.6, as that is the smallest value within the range of Range_FROM (1.5) and Range_TO (1.6).

This is the first dataframe (smalldf):

RoadNo  Range_FROM  Range_TO  values
A001    1.15        1.2       0.1
A001    1.35        1.4       3
A001    1.55        1.6       NaN
A001    1.75        1.8       0.1
A001    1.9         2         NaN

This is the second dataframe (largedf):

RoadNo  B    values
A001    1.1  0.2
A001    1.2  0.1
A001    1.3  1.9
A001    1.4  1.3
A001    1.5  1.6
A001    1.6  1.9
A001    1.7  0.2
A001    1.8  0.1
A001    1.9  1.9
A001    2    1.3

Below is the expected dataframe:

RoadNo  Range_FROM  Range_TO  values
A001    1.15        1.2       0.1
A001    1.35        1.4       1.3
A001    1.55        1.6       1.6
A001    1.75        1.8       0.1
A001    1.9         2         1.3

Below is the code to create the two dataframes:
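
The original code block is not shown here. A minimal reconstruction from the two tables above, assuming pandas and NumPy are available (the variable names smalldf and largedf follow the question; everything else is taken from the printed tables):

```python
import numpy as np
import pandas as pd

# First dataframe: ranges per road, with some missing values.
smalldf = pd.DataFrame({
    "RoadNo": ["A001"] * 5,
    "Range_FROM": [1.15, 1.35, 1.55, 1.75, 1.9],
    "Range_TO": [1.2, 1.4, 1.6, 1.8, 2.0],
    "values": [0.1, 3, np.nan, 0.1, np.nan],
})

# Second dataframe: point measurements (column B) per road.
largedf = pd.DataFrame({
    "RoadNo": ["A001"] * 10,
    "B": [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0],
    "values": [0.2, 0.1, 1.9, 1.3, 1.6, 1.9, 0.2, 0.1, 1.9, 1.3],
})

print(smalldf)
print(largedf)
```

Note that the column has to be accessed as smalldf["values"], because smalldf.values is a built-in DataFrame attribute.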

Please keep in mind that there are other RoadNo values (e.g. A002, A003).

Neither a left join nor fillna(df) in pandas produced my desired result. Is there a function in pandas or SQL that can help me with this operation?

Thank you very much!

Answer

In SQL your problem would look like:
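
The answer's query is not shown here. As a rough sketch of what such a range join could look like, the following runs a query through Python's built-in sqlite3 module so that it can be executed. The column name val is an assumption (renamed from values, which is a reserved word in SQL), and the query only computes the minimum per range; it may differ from what the answer originally wrote.

```python
import sqlite3

# In-memory database; `val` stands in for the `values` column (assumed rename).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE smalldf (RoadNo TEXT, Range_FROM REAL, Range_TO REAL, val REAL)")
conn.execute("CREATE TABLE largedf (RoadNo TEXT, B REAL, val REAL)")
conn.executemany(
    "INSERT INTO smalldf VALUES (?, ?, ?, ?)",
    [("A001", 1.15, 1.2, 0.1), ("A001", 1.35, 1.4, 3.0), ("A001", 1.55, 1.6, None),
     ("A001", 1.75, 1.8, 0.1), ("A001", 1.9, 2.0, None)],
)
conn.executemany(
    "INSERT INTO largedf VALUES (?, ?, ?)",
    [("A001", 1.1, 0.2), ("A001", 1.2, 0.1), ("A001", 1.3, 1.9), ("A001", 1.4, 1.3),
     ("A001", 1.5, 1.6), ("A001", 1.6, 1.9), ("A001", 1.7, 0.2), ("A001", 1.8, 0.1),
     ("A001", 1.9, 1.9), ("A001", 2.0, 1.3)],
)

# Join each range to the largedf rows of the same road whose B lies inside it,
# then take the smallest matching value per range.
query = """
SELECT s.RoadNo, s.Range_FROM, s.Range_TO, MIN(l.val) AS min_val
FROM smalldf AS s
LEFT JOIN largedf AS l
  ON l.RoadNo = s.RoadNo
 AND l.B BETWEEN s.Range_FROM AND s.Range_TO
GROUP BY s.RoadNo, s.Range_FROM, s.Range_TO
ORDER BY s.RoadNo, s.Range_FROM
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```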

So we can do:
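
The answer's code is not shown here. One possible pandas sketch of a range join, assuming the dataframes from the question's tables: merge on RoadNo, keep the pairs where B lies inside the range (bounds inclusive, which is an assumption), and take the minimum per smalldf row. The names pairs, in_range and range_min are illustrative.

```python
import numpy as np
import pandas as pd

smalldf = pd.DataFrame({
    "RoadNo": ["A001"] * 5,
    "Range_FROM": [1.15, 1.35, 1.55, 1.75, 1.9],
    "Range_TO": [1.2, 1.4, 1.6, 1.8, 2.0],
    "values": [0.1, 3, np.nan, 0.1, np.nan],
})
largedf = pd.DataFrame({
    "RoadNo": ["A001"] * 10,
    "B": [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0],
    "values": [0.2, 0.1, 1.9, 1.3, 1.6, 1.9, 0.2, 0.1, 1.9, 1.3],
})

# reset_index() keeps the smalldf row position in a column named "index",
# so every matched pair can be traced back to its range.
pairs = smalldf.reset_index().merge(largedf, on="RoadNo", suffixes=("", "_large"))

# Keep only the pairs where B falls inside [Range_FROM, Range_TO].
in_range = pairs[pairs["B"].between(pairs["Range_FROM"], pairs["Range_TO"])]

# Smallest largedf value per smalldf row.
range_min = in_range.groupby("index")["values_large"].min()
print(range_min)
```

Merging on RoadNo first builds every range/point pair per road, which is simple but can get large if a road has many rows in both dataframes.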

Output:


But since you actually want something a little different, let’s try:
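
The answer's code is not shown here. A hedged sketch of the fill-and-update step, assuming the rule is "take the smaller of the existing value and the range minimum, ignoring NaN" and that the range bounds are inclusive; the names range_min and result are illustrative.

```python
import numpy as np
import pandas as pd

smalldf = pd.DataFrame({
    "RoadNo": ["A001"] * 5,
    "Range_FROM": [1.15, 1.35, 1.55, 1.75, 1.9],
    "Range_TO": [1.2, 1.4, 1.6, 1.8, 2.0],
    "values": [0.1, 3, np.nan, 0.1, np.nan],
})
largedf = pd.DataFrame({
    "RoadNo": ["A001"] * 10,
    "B": [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0],
    "values": [0.2, 0.1, 1.9, 1.3, 1.6, 1.9, 0.2, 0.1, 1.9, 1.3],
})

# Minimum largedf value per smalldf row, matched on RoadNo and range.
pairs = smalldf.reset_index().merge(largedf, on="RoadNo", suffixes=("", "_large"))
in_range = pairs[pairs["B"].between(pairs["Range_FROM"], pairs["Range_TO"])]
range_min = in_range.groupby("index")["values_large"].min()

# Row-wise minimum of the existing value and the range minimum.
# DataFrame.min skips NaN by default, so missing values are filled
# and larger existing values are replaced in one step.
result = smalldf.copy()
result["values"] = pd.concat([smalldf["values"], range_min], axis=1).min(axis=1)
print(result)
```

With the tables exactly as printed, the third row comes out as 1.9, because the only B inside 1.55 to 1.6 is 1.6 (value 1.9). The 1.6 in the expected dataframe corresponds to a range starting at 1.5, as in the question's prose example.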

Output:

User contributions licensed under: CC BY-SA