pandas - Auto join a python dataframe to update it -
i perform auto-join on python dataframe update it. here situation, have first df 3 columns: in, out & date. means @ specific date item "out" replaced "in".
import pandas pd import numpy np datetime import datetime data = [[1,10,"2017-01-01"],[2,10,"2017-01-01"],[10,11,"2017-06-01"],[4,14,"2017-04-01"],[5,14,"2017-12-01"]] label = ["out","in","date"] df = pd.dataframe(data,columns=label) df['date'] = pd.to_datetime(df['date']) print(df) out in date 0 1 10 2017-01-01 1 2 10 2017-01-01 2 10 11 2017-06-01 3 4 14 2017-04-01 4 5 14 2017-12-01 for example means here of first of jan 2017, item #1 replaced item #10. trick of june 2017, item #10 replaced item #11. #1 becomes #10 becomes #11.
now populate final table gives final relationships date.
if date = 2017-08-01, table
date = pd.to_datetime("2017-08-01") data = [[1,11],[2,11],[10,11],[4,14]] df_final = pd.dataframe(data,columns=["out","in"]) print(df_final) out in 0 1 11 1 2 11 2 10 11 3 4 14 would know how perform such auto join?
thanks,
you can use list comprehension methods , .loc locate values.
import pandas pd import numpy np datetime import datetime data = [[1,10,"2017-01-01"],[2,10,"2017-01-01"],[10,11,"2017-06-01"],[4,14,"2017-04-01"],[5,14,"2017-12-01"],[11,18,"2017-12-01"]] label = ["out","in","date"] df = pd.dataframe(data,columns=label) df['date'] = pd.to_datetime(df['date']) print(df) out in date 0 1 10 2017-01-01 1 2 10 2017-01-01 2 10 11 2017-06-01 3 4 14 2017-04-01 4 5 14 2017-12-01 5 11 18 2017-12-01 l=[] row in df.iterrows(): x = row[1]['out'] y = row[1]['in'] while y in df.out.values.tolist(): y = df.loc[df['out'] == y,'in'].iloc[0] l.append((x,y)) df2 = pd.dataframe(l, columns=['out', 'in']) print(df2) out in 1 18 2 18 10 18 4 14 5 14 11 18
Comments
Post a Comment