Better solution to index a DataFrame according to the values of 2 others

Question

I would like to index a DataFrame (aaxx_df) according to the values of 2 others (val1_df for the columns and val2_df for the rows). I put below a solution that works for my problem, but I guess, there must be some much cleaner solutions, possibly via SQL (it seems to me to be very similar to a relational database problem).

Accepted Answer

If you want to index aaxx_df using the values from val2_df (rows) and val1_df (cols), use lookup:vals = aaxx_df.lookup(val2_df.values.flatten(), val1_df.values.flatten())    res = pd.DataFrame(data=vals.reshape(val1_df.shape), columns=val1_df.columns)print(res)Output    A   B0  10 -301 -20  20Since Dec 26, 2020, with pandas 1.2, the lookup method is deprecated. Here is a minimal version as a function:def lookup(df, row_labels, col_labels) -> np.ndarray:    """    Label-based "fancy indexing" function for DataFrame.    Given equal-length arrays of row and column labels, return an    array of the values corresponding to each (row, col) pair.    Parameters    ----------    row_labels : sequence        The row labels to use for lookup.    col_labels : sequence        The column labels to use for lookup.    Returns    -------    numpy.ndarray        The found values.    """    row_idx = df.index.get_indexer(row_labels)    col_idx = df.columns.get_indexer(col_labels)    flat_index = row_idx * len(df.columns) + col_idx    return df.values.flat[flat_index]The code becomes thus:vals = lookup(aaxx_df, val2_df.values.flatten(), val1_df.values.flatten())    res = pd.DataFrame(data=vals.reshape(val1_df.shape), columns=val1_df.columns)print(res)

Advertisement

Answer