R - Head and tail by group


I'm re-posting this question after the confusion caused on my part; apologies for that. I believe the example is now correct.

Sample data:

    df <- data.frame(group=rep(c("a","b","c"),c(8,10,8)),
                     size=c(rep(1000,5),rep(0,3),rep(2000,7),rep(0,3),rep(5000,5),rep(0,3)),
                     out=c(rep(0,5),rnorm(3,5,1),rep(0,7),rnorm(3,5,1),rep(0,5),rnorm(3,5,1)),
                     g1=rbinom(26,1,.5),g2=rbinom(26,1,.5),g3=rbinom(26,1,.5))

       group size      out g1 g2 g3
    1      a 1000 0.000000  0  0  1
    2      a 1000 0.000000  0  1  0
    3      a 1000 0.000000  0  1  0
    4      a 1000 0.000000  0  1  0
    5      a 1000 0.000000  0  0  1
    6      a    0 3.997360  1  1  0
    7      a    0 4.992823  1  0  1
    8      a    0 5.644386  1  1  1
    9      b 2000 0.000000  1  1  0
    10     b 2000 0.000000  0  1  1
    11     b 2000 0.000000  0  0  0
    12     b 2000 0.000000  1  0  1
    13     b 2000 0.000000  1  1  0
    14     b 2000 0.000000  1  0  1
    15     b 2000 0.000000  1  1  1
    16     b    0 5.247895  1  0  0
    17     b    0 5.248148  0  0  1
    18     b    0 5.026844  1  1  1
    19     c 5000 0.000000  0  0  0
    20     c 5000 0.000000  0  1  0
    21     c 5000 0.000000  0  1  1
    22     c 5000 0.000000  0  0  0
    23     c 5000 0.000000  1  0  1
    24     c    0 6.532156  1  1  0
    25     c    0 5.457338  0  0  0
    26     c    0 4.675683  1  1  1

I first obtain this:

       group size      out g1 g2 g3
    1      a 1000 0.000000  1  1  1
    6      a    0 7.276473  0  0  1
    9      b 2000 0.000000  0  0  0
    16     b    0 5.630425  1  0  0
    19     c 5000 0.000000  0  0  0
    24     c    0 5.449923  1  0  1

and the final output is:

       group size      out g1 g2 g3
    6      a    0 7.276473  1  1  1
    16     b    0 5.630425  0  0  0
    24     c    0 5.449923  0  0  0

Basically, the values of g1-g3 in the first row (per group) replace the values in the second row per group. I'm looking for a base R solution.

The solution should:

1) Select the first row per group where out==0 and size>0 (row 1), and select the first row per given group where out!=0 and size==0 (row 2).

2) Take the dummies g1-g3 from the first row and use them to replace those in the second row per group.

3) Keep only the last of these two rows per group.

Here is a possible (partial) solution:

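    # For each group, stack the first row with size>0 & out==0
    # on top of the first row with size==0 & out!=0.
    sol <- with(df, by(df, group, function(x)
      rbind(head(x[(x$size>0 & x$out==0), ],1),
            head(x[x$size==0 & x$out!=0, ],1))))
    data.frame(do.call(rbind,sol),check.names=FALSE)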

In order to make a reproducible example, when you use RNGs or sample(), you should call set.seed().
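As a minimal sketch of why this matters (the variable names `first`, `second` and `third` are made up for illustration): resetting the seed replays the same random draws, so anyone running the code sees the same data.

```r
# Same seed, same draws: resetting the seed replays the random stream.
set.seed(5175)
first <- rnorm(3, 5, 1)

set.seed(5175)
second <- rnorm(3, 5, 1)

identical(first, second)  # TRUE

# Without resetting the seed, the next call continues the stream,
# so these three values are new draws rather than a replay.
third <- rnorm(3, 5, 1)
identical(first, third)
```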

    set.seed(5175)

    df <- data.frame(group=rep(c("a","b","c"),c(8,10,8)),
                     size = c(rep(1000,5),rep(0,3),rep(2000,7),rep(0,3),rep(5000,5),rep(0,3)),
                     out=c(rep(0,5),rnorm(3,5,1),rep(0,7),rnorm(3,5,1),rep(0,5),rnorm(3,5,1)),
                     g1=rbinom(26,1,.5),
                     g2=rbinom(26,1,.5),
                     g3=rbinom(26,1,.5))

    fun <- function(x){
        # first row of the group with size > 0 and out == 0
        i <- min(which(x$size > 0 & x$out == 0))
        tmp1 <- x[i, ]
        # first row of the group with size == 0 and out != 0
        i <- min(which(x$size == 0 & x$out != 0))
        tmp2 <- x[i, ]
        # copy g1-g3 (columns 4 to 6) from the first row into the second
        tmp2[, 4:6] <- tmp1[, 4:6]
        tmp2
    }

    res <- do.call(rbind, lapply(split(df, df$group), fun))
    res
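The same idea can be checked by eye on a small, non-random data set. The sketch below is only an illustration under assumptions: `toy` is made-up data, and `merge_rows` is a hypothetical variant of `fun` that picks the dummies by name instead of by column position and returns NULL for a group lacking either kind of row (where `min(which(...))` would return Inf with a warning).

```r
# Toy data with fixed (non-random) values so the result is easy to verify.
toy <- data.frame(
  group = rep(c("a", "b"), each = 4),
  size  = c(1000, 1000, 0, 0, 2000, 2000, 0, 0),
  out   = c(0, 0, 4.5, 5.5, 0, 0, 6.1, 4.9),
  g1    = c(1, 0, 0, 1, 0, 1, 1, 0),
  g2    = c(0, 1, 1, 0, 1, 0, 0, 1),
  g3    = c(1, 1, 0, 0, 0, 0, 1, 1)
)

# Hypothetical helper (not from the original answer).
merge_rows <- function(x, dummies = c("g1", "g2", "g3")) {
  first  <- which(x$size > 0 & x$out == 0)   # candidate "row 1" positions
  second <- which(x$size == 0 & x$out != 0)  # candidate "row 2" positions
  # Skip groups that lack either kind of row.
  if (length(first) == 0 || length(second) == 0) return(NULL)
  res <- x[second[1], ]                      # keep the second row ...
  res[, dummies] <- x[first[1], dummies]     # ... with the first row's dummies
  res
}

pieces  <- lapply(split(toy, toy$group), merge_rows)
toy_res <- do.call(rbind, Filter(Negate(is.null), pieces))
toy_res
# Group a keeps original row 3 with g1-g3 = 1, 0, 1 (taken from row 1);
# group b keeps original row 7 with g1-g3 = 0, 1, 0 (taken from row 5).
```

Selecting columns by name keeps the helper working if the column order of the data frame changes.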
