python - How to generate a new Pandas DataFrame where I compress some rows into new columns?
I am new to pandas DataFrames. I have been watching tutorials and reading the documentation, but I cannot figure out a way to do what I want. I have a DataFrame indexed by timestamps, and I want to bucket each period into a single row. Graphically:
    import pandas as pd

    # Start date of the series
    start_date = '20130101'

    # Range of dates
    dates = pd.date_range(start_date, periods=6)

    # An arbitrary DataFrame
    df = pd.DataFrame([["(1,1)", "(1,2)"],
                       ["(2,1)", "(2,2)"],
                       ["(3,1)", "(3,2)"],
                       ["(4,1)", "(4,2)"],
                       ["(5,1)", "(5,2)"],
                       ["(6,1)", "(6,2)"]],
                      index=dates, columns=list('ab'))
    print(df)

    # Range of bucketing periods, in this case 3 periods covering 2 days each
    rng = pd.period_range(start_date, periods=3, freq='2D')
This results in:

                    a      b
    2013-01-01  (1,1)  (1,2)
    2013-01-02  (2,1)  (2,2)
    2013-01-03  (3,1)  (3,2)
    2013-01-04  (4,1)  (4,2)
    2013-01-05  (5,1)  (5,2)
    2013-01-06  (6,1)  (6,2)
What I want is to generate a new DataFrame that has the periods in rng = pd.period_range(start_date, periods=3, freq='2D') as its index, with the rows belonging to each period laid out as consecutive columns:
                    a      b     a1     b1
    2013-01-01  (1,1)  (1,2)  (2,1)  (2,2)
    2013-01-03  (3,1)  (3,2)  (4,1)  (4,2)
    2013-01-05  (5,1)  (5,2)  (6,1)  (6,2)
Is there a method in the API I can use for this? I imagine I need to generate the new labels a1, b1.
Also, after thinking about it a bit more, I could also work with:
                    a     a1      b     b1
    2013-01-01  (1,1)  (2,1)  (1,2)  (2,2)
    2013-01-03  (3,1)  (4,1)  (3,2)  (4,2)
    2013-01-05  (5,1)  (6,1)  (5,2)  (6,2)
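One way is to convert the periods to timestamps, build a DataFrame from them, concatenate it with the original, and forward-fill the resulting NaN values (ffill) so that every row carries the start timestamp of its period. Then set that new timestamp column as the index and reshape based on it, i.e.: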
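    # One row per period, indexed by (and containing) the period's start timestamp
    n = pd.DataFrame(rng.to_timestamp()).set_index(rng.to_timestamp())

    # Attach the period start to every row, forward-fill it, and use it as the index
    result = pd.concat([df, n], axis=1).ffill().set_index(0)

    # Number the rows within each period (0, 1, ...) and pivot that counter into columns
    result = result.set_index(result.groupby(level=0).cumcount(), append=True).unstack()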
Output:

                    a             b
                    0      1      0      1
    0
    2013-01-01  (1,1)  (2,1)  (1,2)  (2,2)
    2013-01-03  (3,1)  (4,1)  (3,2)  (4,2)
    2013-01-05  (5,1)  (6,1)  (5,2)  (6,2)
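If flat labels such as a, a1, b, b1 are preferred over two-level columns, the column MultiIndex can be collapsed afterwards. The following is only a minimal sketch that rebuilds the example data and repeats the steps above; the variable name starts and the choice to leave position 0 without a suffix are assumptions made here to match the layout asked for in the question.

```python
import pandas as pd

# Rebuild the example data from the question
start_date = "20130101"
dates = pd.date_range(start_date, periods=6)
df = pd.DataFrame(
    [[f"({i},1)", f"({i},2)"] for i in range(1, 7)],
    index=dates,
    columns=list("ab"),
)
rng = pd.period_range(start_date, periods=3, freq="2D")

# Same approach as above: tag each row with the start of its period
starts = rng.to_timestamp()  # 'starts' is a name chosen for this sketch
n = pd.DataFrame(starts).set_index(starts)
result = pd.concat([df, n], axis=1).ffill().set_index(0)

# Number the rows inside each period and pivot that counter into columns
result = result.set_index(
    result.groupby(level=0).cumcount(), append=True
).unstack()

# Collapse the two-level columns: ('a', 0) -> 'a', ('a', 1) -> 'a1', ...
result.columns = [
    col if pos == 0 else f"{col}{pos}" for col, pos in result.columns
]
result.index.name = None  # drop the leftover index name 0

print(result)
```

With more than two rows per period the same comprehension would yield a2, b2 and so on, since the suffix comes straight from the cumcount position.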