sas - pick every first occurrence of the column value over a period of time -
i have data shown below.
msisdn date net_type 11111 01/01/2017 1 11111 02/01/2017 1 11111 03/01/2017 1 11111 04/01/2017 2 11111 05/01/2017 2 11111 06/01/2017 2 11111 07/01/2017 2 11111 08/01/2017 2 11111 09/01/2017 1 11111 10/01/2017 1 11111 11/01/2017 1 11111 12/01/2017 1 11111 13/01/2017 1 11111 14/01/2017 2 11111 15/01/2017 2 11111 16/01/2017 2
i looking solution create new variable helps pick every first , last occurrence of net_type on period of time (date values not fixed, net_type can 1 or 2 days or months or years). solution looking below.
msisdn date net_type indicator 11111 01/01/2017 1 1 11111 02/01/2017 1 0 11111 03/01/2017 1 1 11111 04/01/2017 2 1 11111 05/01/2017 2 0 11111 06/01/2017 2 0 11111 07/01/2017 2 0 11111 08/01/2017 2 1 11111 09/01/2017 1 1 11111 10/01/2017 1 0 11111 11/01/2017 1 0 11111 12/01/2017 1 0 11111 13/01/2017 1 1 11111 14/01/2017 2 1 11111 15/01/2017 2 0 11111 16/01/2017 2 1
if can provide solution sas helpful.
code i've tried doesn't work in sas:
select *, case when net_type <> coalesce(lag(net_type, 1) on ( partition sub_no order dt), 99) dt end starting, case when net_type <> coalesce(lag(net_type, 1) on ( partition sub_no order dt desc), 99) dt end ending table order dt
in sas simple if know enough data step by-group processing, , hard if don't. you've found, proc sql doesn't support partition by
syntax you're using. here's data step solution:
data have; input msisdn & date & ddmmyy10. net_type; format date ddmmyy10.; cards; 11111 01/01/2017 1 11111 02/01/2017 1 11111 03/01/2017 1 11111 04/01/2017 2 11111 05/01/2017 2 11111 06/01/2017 2 11111 07/01/2017 2 11111 08/01/2017 2 11111 09/01/2017 1 11111 10/01/2017 1 11111 11/01/2017 1 11111 12/01/2017 1 11111 13/01/2017 1 11111 14/01/2017 2 11111 15/01/2017 2 11111 16/01/2017 2 ; run; data want; set have; notsorted msisdn net_type; indicator = first.net_type or last.net_type; put (_all_) (=); run;
when have by
statement in data step, creates temporary first.x
, last.x
variables each variable in by
statement automatically looking ahead @ next row in input dataset.
usually by
statement requires dataset sorted, notsorted
option overrides , allows identify rows variable changes 1 value another, regardless of ordering.
Comments
Post a Comment