python 2.7 - How to get the duration of a threshold breach in a pandas data frame?
I'm looking for a suggested approach to do the following in a time-efficient way in pandas. Let's say I have a dataframe that looks like this:
[timestamp]               [val]
2017-08-19 22:28:42.000   151
2017-08-19 22:28:42.001   127
2017-08-19 22:29:42.000   149
2017-08-19 22:34:10.000   127
2017-08-19 22:35:10.000   126
2017-08-19 22:36:10.000   132
2017-08-19 22:37:10.000   129
2017-08-19 22:39:10.000   124
How do I get the duration of each period when val exceeds 127?
So I'd expect an answer of:
22:28:42 -> 22:28:42.001
22:29:42 -> 22:34:10.000
22:36:10 -> 22:39:10.000
I could then look at these date ranges and carry out actions like: how many datapoints are there between the dates where the value is above 127?
First, sort the data by timestamp:
>> import pandas as pd
>> df['timestamp'] = pd.to_datetime(df['timestamp'])
>> df = df.sort_values('timestamp')
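To reproduce the steps, the sample data from the question can be loaded into a dataframe first. This is a minimal sketch: the column names `timestamp` and `val` are taken from the question, while the `reset_index(drop=True)` call is an addition here so that the row labels match the 0..7 index shown in the later output even if the input was not already sorted.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "timestamp": [
        "2017-08-19 22:28:42.000",
        "2017-08-19 22:28:42.001",
        "2017-08-19 22:29:42.000",
        "2017-08-19 22:34:10.000",
        "2017-08-19 22:35:10.000",
        "2017-08-19 22:36:10.000",
        "2017-08-19 22:37:10.000",
        "2017-08-19 22:39:10.000",
    ],
    "val": [151, 127, 149, 127, 126, 132, 129, 124],
})

# Parse the strings into datetimes, then sort chronologically.
df["timestamp"] = pd.to_datetime(df["timestamp"])
# reset_index is an addition (not in the answer) to keep row labels 0..n-1.
df = df.sort_values("timestamp").reset_index(drop=True)
```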
Then find the positions where val changes to lte (<=) or gt (>) 127:
>> df['changed'] = (df['val'] > 127).astype(int).diff().fillna(1).astype(int)
>> df
                timestamp  val  changed
0 2017-08-19 22:28:42.000  151        1
1 2017-08-19 22:28:42.001  127       -1
2 2017-08-19 22:29:42.000  149        1
3 2017-08-19 22:34:10.000  127       -1
4 2017-08-19 22:35:10.000  126        0
5 2017-08-19 22:36:10.000  132        1
6 2017-08-19 22:37:10.000  129        0
7 2017-08-19 22:39:10.000  124       -1
In the changed column above, for a particular timestamp:
- -1 means val changed to lte 127 (the breach ended at this row)
- +1 means val changed to gt 127 (a breach started at this row)
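One detail worth spelling out: the first row has no predecessor, so `diff()` yields NaN there, and `fillna(1)` marks it as the start of a breach. That is right for the sample data (the first val is 151) but would create a spurious start if the series began at or below the threshold. The sketch below is one possible variant, not part of the original answer; the helper name `mark_changes` is hypothetical.

```python
import pandas as pd

def mark_changes(values, threshold):
    """Return +1 where a run above `threshold` starts, -1 where it ends, else 0.

    Hypothetical helper: a variant of the answer's one-liner that only flags
    the first row as a start when it is actually above the threshold.
    """
    above = (values > threshold).astype(int)
    changed = above.diff()
    # The first row has no previous value; use its own state instead of NaN.
    changed.iloc[0] = above.iloc[0]
    return changed.astype(int)

# Values from the question: same result as the answer's fillna(1) version.
val = pd.Series([151, 127, 149, 127, 126, 132, 129, 124])
changed = mark_changes(val, 127)

# A series that starts below the threshold: no start is flagged on row 0.
starts_low = mark_changes(pd.Series([120, 130, 125]), 127)
```

For the question's values, `changed` is `[1, -1, 1, -1, 0, 1, 0, -1]`, matching the table above; for the second series the result is `[0, 1, -1]`.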
Finally, construct the time intervals you need:
>> pd.DataFrame({
>>     't_0': df.loc[df.changed == 1, 'timestamp'].reset_index(drop=True),
>>     't_n': df.loc[df.changed == -1, 'timestamp'].reset_index(drop=True)})
                      t_n                 t_0
0 2017-08-19 22:28:42.001 2017-08-19 22:28:42
1 2017-08-19 22:34:10.000 2017-08-19 22:29:42
2 2017-08-19 22:39:10.000 2017-08-19 22:36:10
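The question also asks for the duration of each breach and the number of datapoints above the threshold within it. A possible continuation, sketched under some assumptions: the column names `duration` and `n_above` and the `run_id` grouping are introduced here for illustration, the data starts above the threshold (so `fillna(1)` is valid), and it ends at or below the threshold (otherwise the last `t_n` would be missing).

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-08-19 22:28:42.000", "2017-08-19 22:28:42.001",
        "2017-08-19 22:29:42.000", "2017-08-19 22:34:10.000",
        "2017-08-19 22:35:10.000", "2017-08-19 22:36:10.000",
        "2017-08-19 22:37:10.000", "2017-08-19 22:39:10.000",
    ]),
    "val": [151, 127, 149, 127, 126, 132, 129, 124],
}).sort_values("timestamp").reset_index(drop=True)

# Same marking step as in the answer.
above = df["val"] > 127
df["changed"] = above.astype(int).diff().fillna(1).astype(int)

# Same interval construction as in the answer.
intervals = pd.DataFrame({
    "t_0": df.loc[df["changed"] == 1, "timestamp"].reset_index(drop=True),
    "t_n": df.loc[df["changed"] == -1, "timestamp"].reset_index(drop=True),
})

# Duration runs from the first sample above the threshold to the first
# sample back at or below it (the answer's convention for t_n).
intervals["duration"] = intervals["t_n"] - intervals["t_0"]

# Number each run: the counter increases at every +1 marker, so all rows
# from one start up to the next share a run id (0, 1, 2, ...).
run_id = (df["changed"] == 1).cumsum() - 1
# Count only the rows that are actually above the threshold in each run.
intervals["n_above"] = above.astype(int).groupby(run_id).sum().reset_index(drop=True)
```

With the sample data this gives durations of 1 ms, 4 min 28 s and 3 min, with 1, 1 and 2 datapoints above 127 respectively.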