python - Pandas DataFrame filtering with condition on other DataFrame, which no longer works on version 0.20.x
Up until pandas version 0.19.2, the filtering below worked:
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.randn(6, 5), columns=list('abcde'))
    df_filter = pd.DataFrame(np.array([[1, 2, 10], [2, 1, 7], [1, 8, 3], [3, 9, 4], [1, 20, 2], [1, 4, 8]]), columns=list('bce'))
    df[df_filter < 5]

The output of df is below (the values are random):
              a         b         c         d         e
    0  0.257206  0.510411 -0.548331 -0.083934  1.824593
    1 -1.534482 -1.073950  0.639955  0.351071 -1.897773
    2  0.749863  0.152933 -0.960877  1.162595  0.374817
    3 -0.360232  0.479257  0.956225 -0.039248  0.381733
    4 -0.519164  0.188241  0.614066 -0.356650 -0.886236
    5  0.314688 -1.021030  0.689874  1.723714 -1.487867

The output of df_filter:
       b   c   e
    0  1   2  10
    1  2   1   7
    2  1   8   3
    3  3   9   4
    4  1  20   2
    5  1   4   8

And the output of df[df_filter < 5]:
        a         b         c   d         e
    0 NaN  0.510411 -0.548331 NaN       NaN
    1 NaN -1.073950  0.639955 NaN       NaN
    2 NaN  0.152933       NaN NaN  0.374817
    3 NaN  0.479257       NaN NaN  0.381733
    4 NaN  0.188241       NaN NaN -0.886236
    5 NaN -1.021030  0.689874 NaN       NaN

However, from pandas version 0.20.x onward, df[df_filter < 5] no longer works and raises an exception.
Is there another way to get, on version 0.20.x, the same filtering that worked up to pandas version 0.19.2?
You might want to reindex the (df_filter < 5) mask so that it has the same index and columns as df, filling the missing columns with False:
    In [866]: df[(df_filter < 5).reindex(df.index, df.columns, fill_value=False)]
    Out[866]:
        a         b         c   d         e
    0 NaN -0.269032 -1.129067 NaN       NaN
    1 NaN -0.048834  0.373961 NaN       NaN
    2 NaN -0.210012       NaN NaN -0.763331
    3 NaN -0.767513       NaN NaN  1.016767
    4 NaN  0.255832       NaN NaN -1.494916
    5 NaN -1.364790  0.345673 NaN       NaN
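For anyone who wants to try the reindexing approach end to end, here is a minimal self-contained sketch. The fixed seed and the names rng, mask and filtered are my own additions for reproducibility, not part of the original post. It also passes index and columns as keywords rather than positionally, since as far as I know recent pandas releases only accept them that way.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (assumption: any seed works).
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.randn(6, 5), columns=list("abcde"))
df_filter = pd.DataFrame(
    np.array([[1, 2, 10], [2, 1, 7], [1, 8, 3], [3, 9, 4], [1, 20, 2], [1, 4, 8]]),
    columns=list("bce"),
)

# The raw comparison only has columns b, c, e. Reindex it to df's shape and
# fill the columns that df_filter lacks (a, d) with False.
mask = (df_filter < 5).reindex(index=df.index, columns=df.columns, fill_value=False)

# Cells where the mask is False become NaN; the rest keep their value from df.
filtered = df[mask]
print(filtered)
```

Indexing a DataFrame with a boolean DataFrame should behave like df.where(mask), so that spelling can be used instead if it reads more clearly.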